Abstract

We introduce a class of generalized relative entropies (inspired by the Bregman
divergence in information theory) on the Wasserstein space over a weighted
Riemannian or Finsler manifold. We prove that the convexity of all the entropies in
this class is equivalent to the combination of the nonnegative weighted Ricci
curvature and the convexity of another weight function used in the definition of the
generalized relative entropies. This convexity condition corresponds to Lott and
Villani's version of the curvature-dimension condition. As applications, we obtain
appropriate variants of the Talagrand, HWI and logarithmic Sobolev inequalities,
as well as the concentration of measures. We also investigate the gradient flow of
our generalized relative entropy.

1 Introduction

This is a continuation of our work [OT] on the displacement convexity of generalized
entropies and its applications. We consider more general entropies than [OT] and generalize
most results in appropriate ways. Some of our observations shall shed new light on [OT].

It has been known since the celebrated work of McCann [Mc1] that the convexity
of an energy (entropy) functional along geodesics in the Wasserstein space plays a vital
role in the study of the existence and the uniqueness of a ground state (a minimizer of
the energy). Here the (quadratic) Wasserstein space over a complete separable metric
space $(X,d)$ is the space $\mathcal{P}^2(X)$ of Borel probability measures on $X$ having finite second
moments, endowed with the Wasserstein distance function $W_2$ derived from the
Monge-Kantorovich mass transport problem (see Subsection 2.2). We say that a functional $S$
on $\mathcal{P}^2(X)$ is displacement $K$-convex for $K \in \mathbb{R}$ ($\mathrm{Hess}\, S \ge K$ for short) if any pair
$\mu_0, \mu_1 \in \mathcal{P}^2(X)$ can be joined by a minimal geodesic $(\mu_t)_{t \in [0,1]} \subset (\mathcal{P}^2(X), W_2)$ such that

\[ S(\mu_t) \le (1-t)\, S(\mu_0) + t\, S(\mu_1) - \frac{K}{2}(1-t)t\, W_2(\mu_0,\mu_1)^2 \]

holds for all $t \in [0,1]$. As usual, the displacement $0$-convexity may be simply called the
displacement convexity. The word 'displacement' is inserted to avoid confusion with
the convexity along the linear interpolation $S((1-t)\mu_0 + t\mu_1) \le (1-t)S(\mu_0) + tS(\mu_1)$. Since
we deal only with the displacement convexity, we may sometimes omit 'displacement'.

As any geodesic in the Wasserstein space is written as the transport along geodesics
in the underlying metric space, the displacement convexity of an energy functional can
be derived from the convexity of its generating function. For instance, let $S$ be the free
energy functional on $\mathcal{P}^2(\mathbb{R}^n)$ consisting of the internal energy and the potential energy as

\[ S(\mu) = \int_{\mathbb{R}^n} u(\rho)\, dx + \int_{\mathbb{R}^n} \Psi\, d\mu \]

for absolutely continuous probability measures $\mu = \rho\,\mathcal{L}^n$ on $\mathbb{R}^n$ with respect to the Lebesgue
measure $\mathcal{L}^n$, where the energy density $u$ is a function on $[0,\infty)$ and the potential $\Psi$ is a
function on $\mathbb{R}^n$. Then $S$ is strictly displacement convex if $u$ is convex (and satisfies
certain additional conditions to be precise, see Definition 3.1) and $\Psi$ is strictly convex,
and the unique ground state $\nu := \sigma\,\mathcal{L}^n$ satisfies $u'(\sigma) = -\Psi + \lambda$ with a normalizing constant
$\lambda$. We mention that the uniqueness is measured at the level of the energy functional, that
is, we have $S(\mu) - S(\nu) \ge 0$ and equality holds if and only if $\mu = \nu$. Moreover, the
displacement convexity of the free energy is a crucial tool also in the investigation of
the asymptotic behavior of the solution to the associated evolution equation

\[ \frac{\partial \rho}{\partial t} = \mathrm{div}\bigl( \rho \nabla[u'(\rho)] + \rho \nabla \Psi \bigr) \]

by regarding it as the gradient flow of $S$ in the Wasserstein space (see [JKO], [AGS1],
[CMV1] and [CMV2] among others). In particular, the heat flow is regarded as the
gradient flow of the relative entropy (with respect to the Lebesgue measure)

\[ \mathrm{Ent}_{\mathcal{L}^n}(\mu) := \int_{\mathbb{R}^n} \rho \ln \rho\, dx, \]

which is also called the Kullback-Leibler divergence in information theory.

On curved spaces such as Riemannian manifolds, the displacement convexity of energy
functionals is related to the curvature of the underlying space, which is a crucial difference
from the convexity along linear interpolations $(1-t)\mu_0 + t\mu_1$. On a Riemannian manifold
equipped with the Riemannian volume measure $\mathrm{vol}_g$, the relative entropy is similarly
defined by

\[ \mathrm{Ent}_{\mathrm{vol}_g}(\mu) := \int_M \rho \ln \rho\, d\mathrm{vol}_g \qquad \text{for } \mu = \rho\, \mathrm{vol}_g. \]

It has been shown by von Renesse and Sturm [vRS] (inspired by [CMS1] and [OV]) that
for any $K \in \mathbb{R}$ the following are mutually equivalent:

• The relative entropy $\mathrm{Ent}_{\mathrm{vol}_g}$ is displacement $K$-convex on $(\mathcal{P}^2(M), W_2)$.

• The Ricci curvature is bounded from below by $K$ as $\mathrm{Ric}_g(v,v) \ge K\langle v,v \rangle$ for all
$v \in TM$.

• The heat flow is $K$-contractive in the sense that $W_2(\mu_t, \bar\mu_t) \le e^{-Kt} W_2(\mu_0, \bar\mu_0)$ holds
for all $t \ge 0$ and for any weak solutions $(\rho_t)_{t \ge 0}$, $(\bar\rho_t)_{t \ge 0}$ to the heat equation
$\partial\rho/\partial t = \Delta\rho$ such that $\mu_t := \rho_t \mathrm{vol}_g$, $\bar\mu_t := \bar\rho_t \mathrm{vol}_g \in \mathcal{P}^2(M)$.

See also [AGS1], [AGS3], [Oh1], [Sa] and [Vi2, Chapter 23] for the connection between
the $K$-convexity of the functional and the $K$-contraction property of its gradient flow.

The displacement $K$-convexity of $\mathrm{Ent}_{\mathrm{vol}_g}$ ($\mathrm{Hess}\, \mathrm{Ent}_{\mathrm{vol}_g} \ge K$) is called the
curvature-dimension condition $\mathrm{CD}(K,\infty)$ after Bakry and Émery's pioneering work [BE]. One
remarkable point of $\mathrm{CD}(K,\infty)$ is that it can be formulated on general metric measure
spaces without any differentiable (manifold) structure. Such metric measure spaces with
Ricci curvature bounded below are independently investigated by Sturm [St2] and Lott and
Villani [LV2], and known to enjoy several properties common to Riemannian manifolds
of $\mathrm{Ric}_g \ge K$. For example, as was indicated by Otto and Villani [OV], $\mathrm{CD}(K,\infty)$ with
$K > 0$ implies various functional inequalities such as the Talagrand inequality, the HWI
inequality, the logarithmic Sobolev inequality and the global Poincaré inequality ([LV2,
Section 6]).

The curvature-dimension condition $\mathrm{CD}(K,\infty)$ is generalized to $\mathrm{CD}(K,N)$ for each
$K \in \mathbb{R}$ and $N \in (1,\infty]$. On an $n$-dimensional complete connected Riemannian manifold
$(M,g)$ of $n \ge 2$ equipped with a weighted measure $\omega = e^{-f}\mathrm{vol}_g$ with $f \in C^\infty(M)$, the
condition $\mathrm{CD}(K,N)$ is known to be equivalent to the lower bound of the $N$-Ricci curvature
$\mathrm{Ric}_N(v,v) \ge K\langle v,v \rangle$ ([St1], [St3], [OT], [LV2], see Definition 2.1 for the definition of
$\mathrm{Ric}_N$). In particular, an unweighted Riemannian manifold $(M, \mathrm{vol}_g)$ satisfies $\mathrm{CD}(K,N)$
if and only if its Ricci curvature is bounded below by $K$ and its dimension is bounded
above by $N$. We remark that Sturm's and Lott and Villani's definitions of the
curvature-dimension condition are slightly different, though they are equivalent on non-branching
spaces such as Riemannian or Finsler manifolds. In both cases it is a certain convexity
condition of a class of entropies, and Lott and Villani's class is larger than Sturm's.

On non-branching metric measure spaces, the condition $\mathrm{CD}(0,N)$ for $N \in [n,\infty)$ is
equivalent to the displacement convexity of the Rényi entropy

\[ S_N(\rho\omega) := -\int_M \rho^{(N-1)/N}\, d\omega. \]

For $K \neq 0$, however, $\mathrm{CD}(K,N)$ is not simply the displacement $K$-convexity of $S_N$. In
fact, it was shown in [St1] (see also [OT, Theorem 4.1, Remark 4.3(2)] and [BS]) that,
on a weighted Riemannian manifold $(M,\omega)$, $\mathrm{Hess}\, S_N \ge K$ can hold only for $K \le 0$ and
is equivalent to $\mathrm{Ric}_N \ge 0$ regardless of the value of $K \le 0$. It was also observed in
[St1, Theorem 1.7] for unweighted Riemannian manifolds that there are some functionals
whose displacement $K$-convexity characterizes the combination of $\mathrm{Ric} \ge K$ and $\dim \le N$,
whereas it is unclear if there are any applications of these entropies.

In our previous work [OT], we introduced the $m$-relative entropy $H_m$ for the
parameter $m \in [(n-1)/n, 1) \cup (1,\infty)$ inspired by the Bregman divergence in information
theory/geometry (see [Am], [AN]) as well as the Tsallis entropy in statistical mechanics
(see [Ts1], [Ts2]). We fix a reference measure $\nu = \exp_m(-\Psi)\omega$ on a weighted Riemannian
manifold $(M,\omega)$ involving the $m$-exponential function

\[ \exp_m(t) := \max\{1 + (m-1)t,\, 0\}^{1/(m-1)}, \]

then the $m$-relative entropy of an absolutely continuous measure $\mu \in \mathcal{P}^2(M)$ with respect
to $\nu$ is given by (up to an additive constant)

\[ H_m(\mu) = \frac{1}{m(m-1)} \int_M \left\{ \left(\frac{d\mu}{d\omega}\right)^m - m\, \frac{d\mu}{d\omega} \left(\frac{d\nu}{d\omega}\right)^{m-1} \right\} d\omega. \]

This generalizes the relative and the Rényi entropies in the sense that
$\lim_{m \to 1} H_m(\mu) = \mathrm{Ent}_\nu(\mu) - 1$ and that $H_m(\mu) = N\{m^{-1} S_N(\mu) + 1\}$ with $N = 1/(1-m)$ if $\Psi \equiv 0$ (i.e.,
$\nu = \omega$).

Then the displacement $K$-convexity of $H_m$ is equivalent to the combination of $\mathrm{Ric}_N \ge 0$
(of $(M,\omega)$) and $\mathrm{Hess}\, \Psi \ge K$ ([OT, Theorem 4.1]). We stress that $N$ becomes negative
for $m > 1$, then $\mathrm{Ric}_N$ is defined in the same form as the case of $N \in (n,\infty)$ (see
Definition 2.1). Similarly to $\mathrm{CD}(K,\infty)$, we can derive from $\mathrm{Hess}\, H_m \ge K > 0$ the associated
functional inequalities (see also [AGK], [CGH], [Ta1] for related works) and the
concentration of measures (in terms of $\exp_m$). Furthermore, the gradient flow of $H_m$ produces
weak solutions to the fast diffusion equation ($m < 1$) or the porous medium equation
($m > 1$) with drift of the form

\[ \frac{\partial \rho}{\partial t} = \mathrm{div}_\omega \left( \frac{1}{m} \nabla(\rho^m) + \rho \nabla \Psi \right), \]

where $\mathrm{div}_\omega$ is the divergence of $(M,\omega)$ (see also [Ot], [Vi2, Theorem 23.19]). We remark
that Sturm [St1] studied a more general class of entropies on unweighted Riemannian
manifolds, where $\mathrm{Ric}_N = \mathrm{Ric}$ for all $N$. Compared to it, [OT] gave a detailed investigation
of a concrete class of entropies, on more general weighted Riemannian manifolds (by
choosing appropriate parameters $N$).

In this article, we introduce a more general class of entropies, called the $\varphi$-relative
entropies $H_\varphi$, again inspired by information theory/geometry. Here $\varphi : (0,\infty) \to (0,\infty)$
is a non-decreasing, positive, continuous function. Roughly speaking, our new class
corresponds to Lott and Villani's class of entropies in their definition of the
curvature-dimension condition, while the $m$-relative entropies in [OT] correspond to Sturm's class.
The definition of $H_\varphi$ (see Definition 5.3 for details) involves $\nu = \exp_\varphi(-\Psi)\omega$ with the
$\varphi$-exponential function $\exp_\varphi$ which is the inverse function of the $\varphi$-logarithmic function
$\ln_\varphi(t) := \int_1^t \varphi(s)^{-1}\, ds$. We recover $\exp_m$ and $H_m$ from $\varphi(s) = s^{2-m}$.

Our first main theorem (Theorem 5.7) asserts that $\mathrm{Hess}\, H_m \ge K$ is equivalent to
$\mathrm{Hess}\, H_\varphi \ge K$ for all $\varphi$'s in a certain class. This actually corresponds to the equivalence
between Sturm's and Lott and Villani's curvature-dimension conditions on weighted
Riemannian manifolds. This reveals that $H_m$ is an extremal element among $H_\varphi$'s in the
appropriate class, see [Ta2] for a related work. Similarly to $H_m$, we can derive from
$\mathrm{Hess}\, H_\varphi \ge K > 0$ the variants of the Talagrand, HWI, logarithmic Sobolev, and global
Poincaré inequalities (Theorem 6.3) as well as the concentration of measures in terms
of $\exp_m$ for some $m = m(\varphi)$ (Theorem 7.9). Moreover, the gradient flow of $H_\varphi$ in
$(\mathcal{P}^2(M), W_2)$ produces weak solutions to the $\varphi$-heat equation (Theorem 8.7 and its
noncompact counterpart in Section 9).

The article is organized as follows: We first review the basic notions of weighted
Riemannian geometry, Wasserstein geometry and information geometry in Section 2. Then,
after preparing necessary notions in Sections 3, 4, we define $H_\varphi$ and study its displacement
convexity in Section 5. Section 6 is devoted to the functional inequalities and Section 7
is concerned with the concentration of measures. The gradient flow of $H_\varphi$ is studied in
Sections 8, 9 in the compact and noncompact cases, respectively. We extend most results
to Finsler manifolds in Section 10. Finally in the Appendix, we compare our concentration of
measures derived from the generalized Talagrand inequality with the Herbst-type
argument deriving the concentration from the $u_\varphi$-entropy inequality, which is a generalization
of the logarithmic Sobolev inequality different from ours.



2 Preliminaries 

2.1 Weighted Riemannian manifolds 

Throughout the article except Section 10, $(M,g)$ will be an $n$-dimensional complete
connected Riemannian manifold without boundary. As we are interested in the role of the
curvature, we will always assume $n \ge 2$. Denote by $d_g$ and $\mathrm{vol}_g$ the Riemannian distance
function and the Riemannian volume measure of $(M,g)$. We fix an arbitrary measure

\[ \omega = e^{-f}\, \mathrm{vol}_g, \qquad f \in C^\infty(M), \]

as our base measure. To control the behavior of $\omega$, we modify the Ricci curvature $\mathrm{Ric}_g$ of
$(M,g)$ as follows.

Definition 2.1 (Weighted Ricci curvature) Given $N \in (-\infty,0) \cup [n,\infty]$, we define
the $N$-Ricci curvature tensor of $(M,\omega)$ by

\[ \mathrm{Ric}_N := \begin{cases}
\mathrm{Ric}_g + \mathrm{Hess}_g f & \text{if } N = \infty, \\
\mathrm{Ric}_g + \mathrm{Hess}_g f - \dfrac{Df \otimes Df}{N-n} & \text{if } N \in (-\infty,0) \cup (n,\infty), \\
\mathrm{Ric}_g + \mathrm{Hess}_g f - \infty \cdot (Df \otimes Df) & \text{if } N = n,
\end{cases} \]

where by convention $\infty \cdot 0 = 0$.

We set $\mathrm{Ric}_N(v) := \mathrm{Ric}_N(v,v)$ and will say that $\mathrm{Ric}_N \ge K$ holds for some $K \in \mathbb{R}$ if
$\mathrm{Ric}_N(v) \ge K\langle v,v \rangle$ for every $v \in TM$.

Remark 2.2 The tensor $\mathrm{Ric}_N$ was usually considered only for $N \in [n,\infty]$, and then
the monotonicity $\mathrm{Ric}_N(v) \le \mathrm{Ric}_{N'}(v)$ for $N \le N'$ clearly holds. Note that $\mathrm{Ric}_\infty$ is the
famous Bakry-Émery tensor and $\mathrm{Ric}_N$ for $N \in (n,\infty)$ was introduced by Qian (see [BE],
[Qi] and [Lo] as well). Extending the range of $N$ to $(-\infty,0) \cup [n,\infty]$ violates the above
monotonicity in $N$; however, observe that $\mathrm{Ric}_N$ is non-decreasing in the parameter

\[ m := \frac{N-1}{N} \in \left[ \frac{n-1}{n}, \infty \right), \]

where $m := 1$ if $N = \infty$.

This observation will be helpful for understanding the validity of Theorem 5.7 below.
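
The monotonicity in $m$ noted in Remark 2.2 can be checked numerically. The following is a minimal sketch, not part of the original argument: it only looks at the scalar coefficient $c(N)$ multiplying $Df \otimes Df$ in Definition 2.1 (so $c = 0$, $-1/(N-n)$ or $-\infty$), and the helper names `ricci_weight_coefficient` and `m_parameter` as well as the sample values of $N$ are illustrative assumptions.

```python
import math


def ricci_weight_coefficient(N: float, n: int) -> float:
    """Coefficient c(N) in Ric_N = Ric_g + Hess f + c(N) * (Df x Df) (Definition 2.1)."""
    if math.isinf(N) and N > 0:
        return 0.0
    if N == n:
        return -math.inf  # the case N = n, with the convention inf * 0 = 0
    if N < 0 or N > n:
        return -1.0 / (N - n)
    raise ValueError("N must lie in (-inf, 0) or [n, inf]")


def m_parameter(N: float) -> float:
    """m = (N - 1) / N, with m = 1 for N = infinity (Remark 2.2)."""
    return 1.0 if math.isinf(N) else (N - 1.0) / N


n = 3  # hypothetical dimension
sample_Ns = [3, 4, 10, math.inf, -10, -1, -0.1]  # hypothetical sample values

# Sort the samples by m and check that the coefficient never decreases.
pairs = sorted((m_parameter(N), ricci_weight_coefficient(N, n)) for N in sample_Ns)
coefficients = [c for _, c in pairs]
is_monotone = all(a <= b for a, b in zip(coefficients, coefficients[1:]))
```

Since $Df \otimes Df$ is nonnegative on the diagonal, a non-decreasing coefficient gives a non-decreasing $\mathrm{Ric}_N(v)$ for each fixed $v$.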

Note that, if $(M,\omega)$ satisfies $\mathrm{Ric}_N \ge K$ for some $K \in \mathbb{R}$ and $N \in [n,\infty)$, then
it behaves like a Riemannian manifold with dimension bounded above by $N$ and Ricci
curvature bounded below by $K$ (see [Qi], [Lo], as well as [St2], [St3], [LV1], [LV2], [Vi2,
Part III] related to the curvature-dimension condition). For example, the following area
growth inequality of Bishop type (numerically extended to non-integer $N$'s) holds. Denote
by $\mathrm{area}_\omega[S(x_0,r)]$ the area of the sphere $S(x_0,r) := \{x \in M \mid d_g(x_0,x) = r\}$ with respect
to $\omega$.

Theorem 2.3 ([Qi], [St1, Theorem 2.3]) If $(M,\omega)$ satisfies $\mathrm{Ric}_N \ge 0$ for some
$N \in [n,\infty)$, then

\[ \frac{\mathrm{area}_\omega[S(x_0,R)]}{\mathrm{area}_\omega[S(x_0,r)]} \le \left( \frac{R}{r} \right)^{N-1} \]

holds for any $0 < r < R$ and $x_0 \in M$.

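For $N = \infty$, we have the following global estimate.

Theorem 2.4 ([Vi2, Theorem 18.12]) Under the nonnegativity of $\mathrm{Ric}_\infty$ of $(M,\omega)$,

\[ \int_M \exp\bigl( -\lambda\, d_g(x_0,x)^2 \bigr)\, d\omega(x) < \infty \]

holds for any $\lambda > 0$ and $x_0 \in M$.

Though Theorems 2.3, 2.4 are generalized to $\mathrm{Ric}_N \ge K$ for $K \neq 0$, we will need only
the above special cases.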

2.2 Wasserstein geometry 

Let us recall some basic notions and facts in optimal transport theory and Wasserstein
geometry. See [AGS1], [Vi1] and [Vi2] for details and more information.

Let $(X,d)$ be a metric space. A rectifiable curve $\gamma : [0,1] \to X$ is called a geodesic
if it is locally minimizing and has a constant speed. We say that $\gamma$ is minimal if it is
globally minimizing, namely $d(\gamma(s),\gamma(t)) = |s-t|\, d(\gamma(0),\gamma(1))$ holds for all $s,t \in [0,1]$.
A subset $Y$ of $X$ is said to be totally convex if, for any $x,y \in Y$, any minimal geodesic in
$X$ from $x$ to $y$ is contained in $Y$.

For a complete Riemannian manifold $(M,g)$, let $\mathcal{P}(M)$ be the set of all Borel
probability measures on $M$. Given $\mu \in \mathcal{P}(M)$ and a measurable map $T : M \to M$, the
push forward measure $T_\sharp\mu$ of $\mu$ through $T$ is defined by $T_\sharp\mu[B] := \mu[T^{-1}(B)]$ for all Borel
sets $B \subset M$. For each $p \in [1,\infty)$, denote by $\mathcal{P}^p(M) \subset \mathcal{P}(M)$ the subset consisting of
measures $\mu$ of finite $p$-th moments, that is, $\int_M d_g(x_0,x)^p\, d\mu(x) < \infty$ for some (and hence
all) $x_0 \in M$.

For $\mu, \nu \in \mathcal{P}(M)$, a probability measure $\pi \in \mathcal{P}(M \times M)$ is called a coupling of $\mu$ and
$\nu$ if its projections are $\mu$ and $\nu$, namely $\pi[B \times M] = \mu[B]$ and $\pi[M \times B] = \nu[B]$ hold for
any Borel set $B \subset M$. We define the $L^p$-Wasserstein distance between $\mu, \nu \in \mathcal{P}^p(M)$ by

\[ W_p(\mu,\nu) := \inf \left\{ \left( \int_{M \times M} d_g(x,y)^p\, d\pi(x,y) \right)^{1/p} \,\middle|\, \pi : \text{a coupling of } \mu \text{ and } \nu \right\}. \]



A coupling $\pi$ is said to be optimal if it attains the infimum above. The function $W_p$ is
indeed a distance function on $\mathcal{P}^p(M)$. The metric space $(\mathcal{P}^p(M), W_p)$ is complete,
separable and called the $L^p$-Wasserstein space over $M$. The Wasserstein space inherits several
properties of $M$. For instance, if $M$ is compact, then $(\mathcal{P}^p(M), W_p)$ is also compact and
the topology induced from $W_p$ coincides with the weak topology. We will mainly consider
the quadratic case $p = 2$, and then we omit '$L^2$-' and simply call $W_2$ and $(\mathcal{P}^2(M), W_2)$
the Wasserstein distance function and the Wasserstein space.
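
To make the definition of $W_p$ concrete, here is a minimal sketch in the simplest possible setting, which is an assumption of the example and not the setting of this article: two empirical measures on the real line with the same number of equally weighted atoms. In that case it is a standard fact that the monotone (sorted) coupling is optimal for the cost $|x-y|^p$ with $p \ge 1$, so no optimization is needed. The function name `wasserstein_1d` is hypothetical.

```python
def wasserstein_1d(xs, ys, p=2.0):
    """L^p-Wasserstein distance between two uniform empirical measures on the line.

    Assumes len(xs) == len(ys) and that every atom carries mass 1/len(xs).
    Uses the monotone coupling (i-th smallest x paired with i-th smallest y),
    which is optimal for the convex cost |x - y|**p with p >= 1.
    """
    if not xs or len(xs) != len(ys):
        raise ValueError("both samples must be non-empty and of equal size")
    if p < 1:
        raise ValueError("p must be at least 1")
    xs_sorted, ys_sorted = sorted(xs), sorted(ys)
    mean_cost = sum(abs(x - y) ** p for x, y in zip(xs_sorted, ys_sorted)) / len(xs)
    return mean_cost ** (1.0 / p)


# Translating a measure by 1 moves it by exactly 1 in W_2.
distance = wasserstein_1d([0.0, 1.0], [1.0, 2.0])
```

On a manifold of dimension $n \ge 2$ the optimal coupling is no longer given by sorting, and one has to solve the Monge-Kantorovich problem itself.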

In view of optimal transport theory, $W_2(\mu_0,\mu_1)^2$ is regarded as the least cost of
transporting $\mu_0$ to $\mu_1$, where the cost of transporting a unit mass from $x$ to $y$ is $d_g(x,y)^2$. A
minimal geodesic $(\mu_t)_{t \in [0,1]}$ with respect to $W_2$ is then also called the optimal transport
from $\mu_0$ to $\mu_1$, and it can be described by using a family of minimal geodesics in the
underlying space $M$. We denote by $\Gamma(M)$ the set of all minimal geodesics $\gamma : [0,1] \to M$
endowed with the uniform topology induced from the distance function
$d_{\Gamma(M)}(\gamma,\eta) := \sup_{t \in [0,1]} d_g(\gamma(t),\eta(t))$. For $t \in [0,1]$, the evaluation map $\mathrm{ev}_t : \Gamma(M) \to M$ is defined by
$\mathrm{ev}_t(\gamma) := \gamma(t)$, which is clearly 1-Lipschitz.

Proposition 2.5 ([LV2, Proposition 2.10], [Vi2, Corollary 7.22]) Given any minimal
geodesic $(\mu_t)_{t \in [0,1]} \subset \mathcal{P}^2(M)$, there exists $\Pi \in \mathcal{P}(\Gamma(M))$ such that $(\mathrm{ev}_t)_\sharp \Pi = \mu_t$ for all
$t \in [0,1]$ and that $(\mathrm{ev}_0 \times \mathrm{ev}_1)_\sharp \Pi$ is an optimal coupling of $\mu_0$ and $\mu_1$.

In particular, for any totally convex set $X$ of $(M,d_g)$, $\mathcal{P}^2(X)$ is also totally convex in
$(\mathcal{P}^2(M), W_2)$.

If one of $\mu_0$ and $\mu_1$ is absolutely continuous with respect to $\mathrm{vol}_g$, then a more precise
description of a minimal geodesic $(\mu_t)_{t \in [0,1]}$ is obtained via the gradient vector field of a
locally semi-convex function $\phi$ (i.e., every point $x \in M$ admits a neighborhood on which
$\phi$ is $K$-convex in the weak sense for some $K \in \mathbb{R}$, see Definition 4.1). For a measure $\nu$ on
$M$, we denote by $\mathcal{P}_{ac}(M,\nu) \subset \mathcal{P}(M)$ the subset of absolutely continuous measures with
respect to $\nu$. We also set $\mathcal{P}^p_{ac}(M,\nu) := \mathcal{P}^p(M) \cap \mathcal{P}_{ac}(M,\nu)$.

Theorem 2.6 ([FG, Theorem 1]) Given any $\mu_0 \in \mathcal{P}^2_{ac}(M,\mathrm{vol}_g)$ and $\mu_1 \in \mathcal{P}^2(M)$, there
exists a locally semi-convex function $\phi : \Omega \to \mathbb{R}$ on an open set $\Omega \subset M$ with $\mu_0[\Omega] = 1$
such that the map $T_t(x) := \exp_x(t\nabla\phi(x))$, $t \in [0,1]$, provides a unique minimal geodesic
from $\mu_0$ to $\mu_1$. Precisely, $(T_0 \times T_1)_\sharp \mu_0$ is a unique optimal coupling of $\mu_0$ and $\mu_1$, and
$\mu_t := (T_t)_\sharp \mu_0$ is a unique minimal geodesic from $\mu_0$ to $\mu_1$ with respect to $W_2$.

If $M$ is compact, then the above theorem is due to McCann's celebrated work [Mc2]
and we can take as the potential function $-\phi$ a $c$-concave function for the cost $c(x,y) =
d_g(x,y)^2/2$. (We do not give the definition of the $c$-concave function, what we need is only
the fact that $c$-concave functions are locally semi-convex.) A locally semi-convex function
is locally Lipschitz and twice differentiable almost everywhere by the Alexandrov-Bangert
theorem. Thus $T_t$ is differentiable $\mu_0$-a.e. and the following Jacobian (or Monge-Ampère)
equation holds.

Theorem 2.7 ([Vi2, Theorems 8.7, 11.1]) Under the same assumptions as Theorem 2.6
above, we have $\mu_t \in \mathcal{P}^2_{ac}(M,\mathrm{vol}_g)$ for all $t \in [0,1)$. Moreover, by putting

\[ \rho_t \omega := \mu_t = (T_t)_\sharp \mu_0, \qquad \mathbf{J}^\omega_t(x) := e^{f(x) - f(T_t(x))} \det\bigl( DT_t(x) \bigr), \]

we have $\rho_t(T_t(x))\, \mathbf{J}^\omega_t(x) = \rho_0(x)$ and $\mathbf{J}^\omega_t(x) > 0$ for all $t \in [0,1)$ at $\mu_0$-a.e. $x \in \Omega$. In the
case of $\mu_1 \in \mathcal{P}^2_{ac}(M,\mathrm{vol}_g)$, the above assertions hold also at $t = 1$.

Note that $\mathbf{J}^\omega_t$ should be understood as the Jacobian with respect to $\omega$, and its behavior
is naturally controlled by the weighted Ricci curvature. This is a fundamental geometric
intuition behind the curvature-dimension condition (see Section 5).



2.3 Information geometry 

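We briefly summarize some notions in information geometry associated with a
non-decreasing, positive, continuous function $\varphi : (0,\infty) \to (0,\infty)$. We refer to [Na1] and
[Na2] for further discussion.

We define the $\varphi$-logarithmic function on $(0,\infty)$ by

\[ \ln_\varphi(t) := \int_1^t \frac{1}{\varphi(s)}\, ds, \]

which is clearly strictly increasing. We will denote by $l_\varphi$ and $L_\varphi$ the infimum and the
supremum of $\ln_\varphi$, that is,

\[ l_\varphi := \inf_{t>0} \ln_\varphi(t) = \lim_{t \downarrow 0} \ln_\varphi(t) \in [-\infty,0), \qquad
   L_\varphi := \sup_{t>0} \ln_\varphi(t) = \lim_{t \uparrow \infty} \ln_\varphi(t) \in (0,\infty]. \]

The inverse function of $\ln_\varphi$ is called the $\varphi$-exponential function. We extend it to the
function on $\mathbb{R}$ as

\[ \exp_\varphi(\tau) := \begin{cases}
0 & \text{if } \tau \le l_\varphi, \\
\ln_\varphi^{-1}(\tau) & \text{if } \tau \in (l_\varphi, L_\varphi), \\
\infty & \text{if } \tau \ge L_\varphi.
\end{cases} \]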
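
The two definitions above translate directly into a numerical recipe: integrate $1/\varphi$ to get $\ln_\varphi$, and invert by bisection to get $\exp_\varphi$. The sketch below is only an illustration under stated assumptions. The sample function $\varphi(s) = \sqrt{s}\,(1+s)$ is a hypothetical choice (it is positive, continuous and non-decreasing), picked because $\ln_\varphi(t) = 2\arctan\sqrt{t} - \pi/2$ is then available in closed form, with $l_\varphi = -\pi/2$ and $L_\varphi = \pi/2$. The quadrature is composite Simpson in the variable $x = \ln s$, and the bisection bracket $[e^{-60}, e^{60}]$ is an arbitrary practical cutoff.

```python
import math


def phi(s: float) -> float:
    """Hypothetical sample phi(s) = sqrt(s) * (1 + s); not taken from the article."""
    return math.sqrt(s) * (1.0 + s)


def ln_phi(t: float, steps: int = 2000) -> float:
    """Approximate ln_phi(t) = int_1^t ds / phi(s) by composite Simpson in x = log(s)."""
    if t <= 0.0:
        raise ValueError("t must be positive")
    upper = math.log(t)          # negative for t < 1, giving a signed integral
    h = upper / steps            # steps must be even for Simpson's rule

    def integrand(x: float) -> float:
        s = math.exp(x)
        return s / phi(s)        # ds / phi(s) = e^x dx / phi(e^x)

    total = integrand(0.0) + integrand(upper)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * integrand(k * h)
    return total * h / 3.0


def exp_phi(tau: float, l_phi: float, L_phi: float) -> float:
    """phi-exponential: 0 below l_phi, infinity above L_phi, inverse of ln_phi between."""
    if tau <= l_phi:
        return 0.0
    if tau >= L_phi:
        return math.inf
    lo, hi = -60.0, 60.0         # bisection on x = log(t); practical cutoff
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if ln_phi(math.exp(mid)) < tau:
            lo = mid
        else:
            hi = mid
    return math.exp(0.5 * (lo + hi))


closed_form = 2.0 * math.atan(math.sqrt(4.0)) - math.pi / 2.0
numeric = ln_phi(4.0)
round_trip = exp_phi(ln_phi(0.3), -math.pi / 2.0, math.pi / 2.0)
```

For a general $\varphi$ the values $l_\varphi$ and $L_\varphi$ are not known in advance, which is why they are passed in as arguments here rather than computed.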

We also introduce the strictly convex function

\[ u_\varphi(r) := \int_0^r \ln_\varphi(t)\, dt, \qquad r \in [0,\infty), \]

provided that it is well-defined (i.e., $\ln_\varphi$ is integrable on $(0,1)$).

Lemma 2.8 The function $u_\varphi$ is well-defined if

\[ \inf \left\{ \delta \in \mathbb{R} \,\middle|\, \frac{s^{1+\delta}}{\varphi(s)} \text{ is bounded on } (0,1) \right\} < 1. \qquad (2.1) \]

Proof. As $\varphi$ is positive and non-decreasing, it suffices to see $u_\varphi(1) > -\infty$. We deduce
from the hypothesis that $s/\varphi(s)$ is integrable on $(0,1)$. This shows the claim since

\[ u_\varphi(1) = -\int_0^1 \int_t^1 \frac{1}{\varphi(s)}\, ds\, dt = -\int_0^1 \frac{s}{\varphi(s)}\, ds > -\infty. \]

□

Entropy is a function measuring the uncertainty of an event, and the divergence in
information theory is a quantity expressing the difference between a pair of probability
measures. In this spirit, the $\varphi$-entropy for $\rho\omega \in \mathcal{P}_{ac}(M,\omega)$ is defined by

\[ E_\varphi(\rho\omega) := -\int_M u_\varphi(\rho)\, d\omega \]

(provided that it is well-defined). Then we define the Bregman divergence between
$\rho\omega, \sigma\omega \in \mathcal{P}_{ac}(M,\omega)$ by

\[ D_\varphi(\rho\omega | \sigma\omega) := \int_M \{ u_\varphi(\rho) - u_\varphi(\sigma) - u_\varphi'(\sigma)(\rho - \sigma) \}\, d\omega. \qquad (2.2) \]

The strict convexity of $u_\varphi$ guarantees $D_\varphi(\rho\omega | \sigma\omega) > 0$ unless $\rho = \sigma$ $\omega$-a.e.
Furthermore, the square root of the divergence $D_\varphi$ satisfies a generalized Pythagorean theorem
(see [OW, Proposition 3]) and hence it can be regarded as a kind of distance function,
though it is not symmetric (i.e., $D_\varphi(\rho\omega | \sigma\omega) \neq D_\varphi(\sigma\omega | \rho\omega)$ in general).
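We define three more quantities measuring the order of $\varphi$ for later use:

\[ \theta_\varphi := \sup_{s>0} \left\{ \frac{s}{\varphi(s)} \cdot \limsup_{t \downarrow 0} \frac{\varphi(s+t) - \varphi(s)}{t} \right\} \in [0,\infty], \qquad (2.3) \]

\[ \delta_\varphi := \inf_{s>0} \left\{ \frac{s}{\varphi(s)} \cdot \limsup_{t \downarrow 0} \frac{\varphi(s+t) - \varphi(s)}{t} \right\} \in [0,\infty), \qquad (2.4) \]

\[ N_\varphi := \begin{cases}
(\theta_\varphi - 1)^{-1} & \text{if } \theta_\varphi \neq 1, \\
\infty & \text{if } \theta_\varphi = 1.
\end{cases} \qquad (2.5) \]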
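
For a differentiable $\varphi$ the quantity inside the braces in (2.3) and (2.4) is the logarithmic derivative $s\varphi'(s)/\varphi(s)$, so $\theta_\varphi$ and $\delta_\varphi$ are its supremum and infimum. The following is a rough numerical sketch under clearly stated assumptions: the derivative is replaced by a forward difference quotient, the supremum and infimum are taken only over a finite logarithmic grid (so the results are estimates from inside), and the sample $\varphi(s) = \sqrt{s}\,(1+s)$ is a hypothetical choice for which $s\varphi'(s)/\varphi(s) = (0.5 + 1.5s)/(1+s)$, giving $\theta_\varphi = 1.5$, $\delta_\varphi = 0.5$ and $N_\varphi = 2$.

```python
import math


def sample_phi(s: float) -> float:
    """Hypothetical sample phi(s) = sqrt(s) * (1 + s); not taken from the article."""
    return math.sqrt(s) * (1.0 + s)


def log_derivative(phi, s: float, rel_step: float = 1e-6) -> float:
    """Forward-difference approximation of (s / phi(s)) * phi'(s)."""
    t = rel_step * s
    return s / phi(s) * (phi(s + t) - phi(s)) / t


def estimate_orders(phi, exponents=range(-6, 7)):
    """Grid estimates of (theta_phi, delta_phi, N_phi) over s = 10**k.

    The true sup/inf range over all s > 0, so these are approximations only.
    """
    values = [log_derivative(phi, 10.0 ** k) for k in exponents]
    theta, delta = max(values), min(values)
    n_phi = math.inf if abs(theta - 1.0) < 1e-12 else 1.0 / (theta - 1.0)
    return theta, delta, n_phi


theta_est, delta_est, n_est = estimate_orders(sample_phi)

# For the power function phi_m(s) = s**(2 - m) with m = 0.5 both orders equal 2 - m.
theta_pow, delta_pow, n_pow = estimate_orders(lambda s: s ** 1.5)
```

For the power function the estimate of $N_\varphi$ is close to $(1-m)^{-1} = 2$, matching the value $N_m$ used in Subsection 2.4.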

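The following lemma will be useful.

Lemma 2.9 The function $s^{\delta_\varphi}/\varphi(s)$ is non-increasing in $s \in (0,\infty)$. Moreover, if $\theta_\varphi$ is
finite, then the function $s^{\theta_\varphi}/\varphi(s)$ is non-decreasing in $s \in (0,\infty)$.

Proof. Assume $\theta_\varphi < \infty$ (which will not play any role in the discussion on $\delta_\varphi$), and fix
$s > 0$ and small $\varepsilon > 0$. By the definitions of $\theta_\varphi$ and $\delta_\varphi$, there exists $r_\varepsilon(s) > 0$ such that

\[ \delta_\varphi - \frac{\varepsilon}{2} < \frac{s}{\varphi(s)} \cdot \sup_{t \in (0, r_\varepsilon(s)]} \frac{\varphi(s+t) - \varphi(s)}{t} < \theta_\varphi + \frac{\varepsilon}{2}. \]

Consider the functions

\[ h_+(\tau) := \theta_\varphi + \frac{\varepsilon}{2} - \frac{(1+\tau)^{\theta_\varphi + \varepsilon} - 1}{\tau}, \qquad
   h_-(\tau) := \delta_\varphi - \frac{\varepsilon}{2} - \frac{(1+\tau)^{\delta_\varphi - \varepsilon} - 1}{\tau} \]

for $\tau > 0$. Since $h_+$ and $h_-$ are continuous and satisfy

\[ \lim_{\tau \downarrow 0} h_+(\tau) = \theta_\varphi + \frac{\varepsilon}{2} - (\theta_\varphi + \varepsilon) < 0, \qquad
   \lim_{\tau \downarrow 0} h_-(\tau) = \delta_\varphi - \frac{\varepsilon}{2} - (\delta_\varphi - \varepsilon) > 0, \]

there exists $\tau_\varepsilon > 0$ such that we have $h_+(\tau) < 0$ and $h_-(\tau) > 0$ for all $\tau \in (0, \tau_\varepsilon)$.
Given any $t \in (0, \min\{r_\varepsilon(s), s\tau_\varepsilon\})$, we have

\[ \frac{s^{\theta_\varphi+\varepsilon}}{\varphi(s)} - \frac{(s+t)^{\theta_\varphi+\varepsilon}}{\varphi(s+t)}
 = \frac{t s^{\theta_\varphi+\varepsilon-1}}{\varphi(s+t)} \left\{ \frac{s}{\varphi(s)} \cdot \frac{\varphi(s+t) - \varphi(s)}{t} - \frac{(1+ts^{-1})^{\theta_\varphi+\varepsilon} - 1}{ts^{-1}} \right\}
 < \frac{t s^{\theta_\varphi+\varepsilon-1}}{\varphi(s+t)}\, h_+(ts^{-1}) < 0. \]

As $s > 0$ was arbitrary, this shows that $s^{\theta_\varphi+\varepsilon}/\varphi(s)$ is strictly increasing in $s > 0$. We
similarly obtain

\[ \frac{s^{\delta_\varphi-\varepsilon}}{\varphi(s)} - \frac{(s+t)^{\delta_\varphi-\varepsilon}}{\varphi(s+t)} > \frac{t s^{\delta_\varphi-\varepsilon-1}}{\varphi(s+t)}\, h_-(ts^{-1}) > 0, \]

so that $s^{\delta_\varphi-\varepsilon}/\varphi(s)$ is strictly decreasing. Letting $\varepsilon \downarrow 0$, we complete the proof. □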

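Remark 2.10 The function $\varphi$ will be sometimes normalized so as to satisfy $\varphi(1) = 1$.
This costs no generality as we easily see the following relations for any $a > 0$:

\[ \ln_{a\varphi}(t) = a^{-1} \ln_\varphi(t), \qquad \exp_{a\varphi}(\tau) = \exp_\varphi(a\tau), \qquad u_{a\varphi}(r) = a^{-1} u_\varphi(r), \]

\[ l_{a\varphi} = a^{-1} l_\varphi, \qquad L_{a\varphi} = a^{-1} L_\varphi, \qquad \theta_{a\varphi} = \theta_\varphi, \qquad \delta_{a\varphi} = \delta_\varphi, \qquad N_{a\varphi} = N_\varphi. \]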



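2.4 Information geometry continued: The case of $\varphi_m(s) = s^{2-m}$

In [OT], we considered the power function $\varphi_m(s) := s^{2-m}$ for $m \in (0,2]$ and the
corresponding $m$-logarithmic and $m$-exponential functions. (We have actually considered
$m \in [(n-1)/n, \infty)$ in [OT], but $\varphi_m$ is non-decreasing only when $m \le 2$.) We summarize
several facts in this especially important case. For brevity, we set

\[ \ell_m := \ln_{\varphi_m}, \quad e_m := \exp_{\varphi_m}, \quad l_m := l_{\varphi_m}, \quad L_m := L_{\varphi_m}, \quad \theta_m := \theta_{\varphi_m}, \quad N_m := N_{\varphi_m}. \]

(A) In the case of $\varphi_1(s) = s$, $\ell_1$ and $e_1$ coincide with the usual logarithmic and
exponential functions, respectively. Thus we find $l_1 = -\infty$ and $L_1 = \infty$. We can
easily observe $\theta_1 = 1$ and $N_1 = \infty$ as well. For $\rho\omega, \sigma\omega \in \mathcal{P}_{ac}(M,\omega)$, we deduce from
$u_{\varphi_1}(r) = r \ln r - r$ that

\[ E_{\varphi_1}(\rho\omega) = -\int_M \rho \ln \rho\, d\omega + 1, \qquad D_{\varphi_1}(\rho\omega | \sigma\omega) = \int_M \rho \ln \frac{\rho}{\sigma}\, d\omega. \]

Namely $E_{\varphi_1}$ is the Boltzmann entropy up to adding 1, and $D_{\varphi_1}$ is the Kullback-Leibler
divergence. By choosing $\sigma \equiv 1$ formally in the definition of $D_{\varphi_1}(\rho\omega | \sigma\omega)$, the relative
entropy of $\mu \in \mathcal{P}^2(M)$ with respect to $\omega$ is defined by

\[ \mathrm{Ent}_\omega(\mu) := \begin{cases}
\displaystyle \lim_{\varepsilon \downarrow 0} \int_{\{\rho > \varepsilon\}} \rho \ln \rho\, d\omega & \text{if } \mu = \rho\omega \in \mathcal{P}^2_{ac}(M,\omega), \\
\infty & \text{otherwise.}
\end{cases} \qquad (2.6) \]

In other words, the relative entropy is '$(-1) \times$ the Boltzmann entropy.'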

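(B) For $\varphi_m(s) = s^{2-m}$ with $m \in (0,1) \cup (1,2]$, $\ell_m$ and $e_m$ are given by power functions
as

\[ \ell_m(t) = \frac{t^{m-1} - 1}{m-1}, \qquad e_m(\tau) = [1 + (m-1)\tau]_+^{1/(m-1)}, \]

where we set $[t]_+ := \max\{t, 0\}$ and by convention $0^a := \infty$ for $a < 0$. Observe

\[ l_m = \begin{cases} -\infty & \text{if } m < 1, \\ -\dfrac{1}{m-1} & \text{if } m > 1, \end{cases} \qquad
   L_m = \begin{cases} \dfrac{1}{1-m} & \text{if } m < 1, \\ \infty & \text{if } m > 1, \end{cases} \qquad (2.7) \]

$\theta_m = 2 - m$ and $N_m = (1-m)^{-1}$. As $u_{\varphi_m}(r) = (r^m - mr)/\{m(m-1)\}$, the $\varphi_m$-entropy
for $\rho\omega \in \mathcal{P}_{ac}(M,\omega)$ is given by

\[ E_{\varphi_m}(\rho\omega) = -\frac{1}{m(m-1)} \int_M \rho^m\, d\omega + \frac{1}{m-1}. \]

Up to additive and multiplicative constants, this coincides with the Rényi(-Tsallis)
entropy

\[ S_N(\rho\omega) := -\int_M \rho^{(N-1)/N}\, d\omega \qquad (2.8) \]

with $N = N_m$, which is applied to complex (strongly correlated) systems. The Bregman
divergence between $\rho\omega, \sigma\omega \in \mathcal{P}_{ac}(M,\omega)$ is given by

\[ D_{\varphi_m}(\rho\omega | \sigma\omega) = \frac{1}{m(m-1)} \int_M \bigl[ (\rho^m - \sigma^m) - m\sigma^{m-1}(\rho - \sigma) \bigr]\, d\omega. \]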

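This coincides with the $\beta$-divergence, whose strength is its robustness. For instance, we
refer to [MTKE] for the roles and the differences of statistical divergences including the
Bregman divergences. Note that as $m \to 1$ we have

\[ \ell_m(t) \to \ell_1(t), \qquad e_m(\tau) \to e_1(\tau), \qquad E_{\varphi_m}(\rho\omega) \to E_{\varphi_1}(\rho\omega), \qquad D_{\varphi_m}(\rho\omega | \sigma\omega) \to D_{\varphi_1}(\rho\omega | \sigma\omega). \]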
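
The power-function case is explicit enough to be coded directly. The following minimal sketch implements $\ell_m$ and $e_m$ for the case $\varphi_m(s) = s^{2-m}$, including the conventions $[\,\cdot\,]_+ = \max\{\cdot, 0\}$ and $0^a = \infty$ for $a < 0$, and can be used to observe that the two functions are mutually inverse and that $\ell_m \to \ln$ as $m \to 1$. The function names `ell_m` and `e_m` are chosen for this illustration only.

```python
import math


def ell_m(t: float, m: float) -> float:
    """m-logarithm: (t**(m-1) - 1) / (m - 1), and log(t) for m = 1."""
    if t <= 0.0:
        raise ValueError("t must be positive")
    if m == 1.0:
        return math.log(t)
    return (t ** (m - 1.0) - 1.0) / (m - 1.0)


def e_m(tau: float, m: float) -> float:
    """m-exponential: [1 + (m-1) tau]_+ ** (1/(m-1)), and exp(tau) for m = 1."""
    if m == 1.0:
        return math.exp(tau)
    base = max(1.0 + (m - 1.0) * tau, 0.0)
    exponent = 1.0 / (m - 1.0)
    if base == 0.0:
        # Convention 0**a = infinity for a < 0; 0**a = 0 for a > 0.
        return math.inf if exponent < 0.0 else 0.0
    return base ** exponent


inverse_check = e_m(ell_m(2.5, 0.7), 0.7)        # should recover 2.5
near_log = ell_m(2.5, 1.0 + 1e-6)                # close to log(2.5)
above_L_m = e_m(5.0, 0.5)                        # tau >= L_m = 2 gives infinity
below_l_m = e_m(-2.0, 1.5)                       # tau <= l_m = -2 gives 0
```

The last two values illustrate the thresholds $L_m = 1/(1-m)$ for $m < 1$ and $l_m = -1/(m-1)$ for $m > 1$ from (2.7).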

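The function $\varphi_m(s) = s^{2-m}$ is an extremal element among those $\varphi$'s satisfying $\theta_\varphi = 2 - m$
in several respects, as one can see in the next useful lemma for instance.

Lemma 2.11 Assume $\theta_\varphi < 2$ and put $m = 2 - \theta_\varphi$. Then for any $t > 0$ and $\tau \in \mathbb{R}$ we
have

\[ \frac{1}{\varphi(1)}\, \ell_m(t) \le \ln_\varphi(t) \le \frac{t^{2-m}}{\varphi(t)}\, \ell_m(t), \qquad (2.9) \]

\[ \exp_\varphi(\tau) \le e_m(\varphi(1)\tau). \qquad (2.10) \]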

Proof. It follows from Lemma 2.9 that, for any $t > 0$,

\[ \frac{1}{\varphi(1)} \int_1^t s^{-\theta_\varphi}\, ds \le \int_1^t \frac{1}{\varphi(s)}\, ds \le \frac{t^{\theta_\varphi}}{\varphi(t)} \int_1^t s^{-\theta_\varphi}\, ds. \]

This is exactly (2.9) since $\theta_\varphi = 2 - m$.

As for (2.10), the assertion for $\tau \le l_\varphi$ is trivial since $\exp_\varphi(\tau) = 0$ by definition. If $\tau \ge
L_\varphi$, then we deduce from (2.9) that $\varphi(1)\tau \ge \varphi(1)L_\varphi \ge L_m$, which shows $e_m(\varphi(1)\tau) = \infty$.
We therefore assume $l_\varphi < \tau < L_\varphi$ and set $t := \exp_\varphi(\tau) > 0$. Then we obtain again from
(2.9) that

\[ \exp_\varphi(\tau) = t = e_m(\ell_m(t)) \le e_m(\varphi(1) \ln_\varphi(t)) = e_m(\varphi(1)\tau). \]

□
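
Comparison bounds of the type in Lemma 2.11, namely $\varphi(1)^{-1}\ell_m(t) \le \ln_\varphi(t) \le \{t^{\theta_\varphi}/\varphi(t)\}\,\ell_m(t)$ and $\exp_\varphi(\tau) \le e_m(\varphi(1)\tau)$ with $m = 2 - \theta_\varphi$, can be sanity-checked on an explicit example. The sketch below assumes the hypothetical sample $\varphi(s) = \sqrt{s}\,(1+s)$, for which $s^{3/2}/\varphi(s) = s/(1+s)$ is non-decreasing, $\theta_\varphi = 3/2$, $m = 1/2$, $\varphi(1) = 2$, $\ln_\varphi(t) = 2\arctan\sqrt{t} - \pi/2$ and $\exp_\varphi(\tau) = \tan^2(\tau/2 + \pi/4)$ on $(-\pi/2, \pi/2)$. It is an illustration on finitely many points, not a proof.

```python
import math

THETA = 1.5          # theta_phi for the sample phi below
M = 2.0 - THETA      # m = 2 - theta_phi = 0.5


def phi(s: float) -> float:
    """Hypothetical sample phi(s) = sqrt(s) * (1 + s)."""
    return math.sqrt(s) * (1.0 + s)


def ln_phi(t: float) -> float:
    """Closed form of int_1^t ds / phi(s) for the sample phi."""
    return 2.0 * math.atan(math.sqrt(t)) - math.pi / 2.0


def exp_phi(tau: float) -> float:
    """Extended inverse of ln_phi, with l_phi = -pi/2 and L_phi = pi/2."""
    if tau <= -math.pi / 2.0:
        return 0.0
    if tau >= math.pi / 2.0:
        return math.inf
    return math.tan(tau / 2.0 + math.pi / 4.0) ** 2


def ell_m(t: float) -> float:
    return (t ** (M - 1.0) - 1.0) / (M - 1.0)


def e_m(tau: float) -> float:
    base = max(1.0 + (M - 1.0) * tau, 0.0)
    # M - 1 < 0 here, so the convention 0**a = infinity for a < 0 applies.
    return math.inf if base == 0.0 else base ** (1.0 / (M - 1.0))


def log_bounds_hold(t: float, tol: float = 1e-12) -> bool:
    lower = ell_m(t) / phi(1.0)
    upper = t ** THETA / phi(t) * ell_m(t)
    return lower - tol <= ln_phi(t) <= upper + tol


def exp_bound_holds(tau: float, tol: float = 1e-12) -> bool:
    return exp_phi(tau) <= e_m(phi(1.0) * tau) + tol


all_log_ok = all(log_bounds_hold(10.0 ** k) for k in range(-4, 5))
all_exp_ok = all(exp_bound_holds(-2.0 + 0.1 * k) for k in range(41))
```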

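Taking the limits as $t \downarrow 0$ or $t \uparrow \infty$ in (2.9), we obtain from (2.7) the following.

Lemma 2.12 Suppose $\theta_\varphi < 2$.

(i) If $\theta_\varphi < 1$, then $l_\varphi > -\infty$ (equivalently, if $l_\varphi = -\infty$, then $\theta_\varphi \ge 1$).

(ii) If $\theta_\varphi \le 1$, then $L_\varphi = \infty$ (equivalently, if $L_\varphi < \infty$, then $\theta_\varphi > 1$).

We similarly find the corresponding estimates concerning $\delta_\varphi$. Note that $\delta_{\varphi_m} = \theta_m =
2 - m$.

Lemma 2.13 Assume $\delta_\varphi < 2$ and put $m = 2 - \delta_\varphi$. Then for any $t > 0$ and $\tau \in \mathbb{R}$ we
have

\[ \frac{t^{2-m}}{\varphi(t)}\, \ell_m(t) \le \ln_\varphi(t) \le \frac{1}{\varphi(1)}\, \ell_m(t), \qquad \exp_\varphi(\tau) \ge e_m(\varphi(1)\tau). \]

In particular,

(i) If $\delta_\varphi > 1$, then $L_\varphi < \infty$ (equivalently, if $L_\varphi = \infty$, then $\delta_\varphi \le 1$).

(ii) If $\delta_\varphi \ge 1$, then $l_\varphi = -\infty$ (equivalently, if $l_\varphi > -\infty$, then $\delta_\varphi < 1$).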


3 Displacement convexity classes $\mathcal{DC}_N$

In this section, we introduce the important classes of convex functions. These classes were
first considered by McCann [Mc1] for $N \ge 1$ (see also §5.1], [Vi1, Section 5.2], [Vi2,
Chapter 16]), we adopt the same definition also for $N < 0$.

Definition 3.1 (Displacement convexity classes) For $N \in (-\infty,0) \cup [1,\infty)$, we
define $\mathcal{DC}_N$ as the set of all continuous convex functions $u : [0,\infty) \to \mathbb{R}$ such that $u(0) = 0$
and that the function

\[ \psi_N(r) := r^N u(r^{-N}) \]

is convex on $(0,\infty)$. In a similar way, $\mathcal{DC}_\infty$ is defined as the set of all continuous convex
functions $u : [0,\infty) \to \mathbb{R}$ such that $u(0) = 0$ and that the function

\[ \psi_\infty(r) := e^r u(e^{-r}) \]

is convex on $\mathbb{R}$.

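The following is well-known for $N \ge 1$, we give a proof for completeness.

Lemma 3.2 If $u \in \mathcal{DC}_N$ for $N \in [1,\infty]$ (resp. $N \in (-\infty,0)$), then the function $\psi_N$ is
non-increasing (resp. non-decreasing).

Proof. For $N \in [1,\infty)$ and $0 < s < t$, the convexity of $u$ and $u(0) = 0$ yield

\[ \psi_N(t) = t^N u(t^{-N}) \le t^N \left\{ \left( 1 - \frac{s^N}{t^N} \right) u(0) + \frac{s^N}{t^N}\, u(s^{-N}) \right\} = \psi_N(s). \]

We similarly obtain for $N = \infty$ and $s,t \in \mathbb{R}$ with $s < t$ that

\[ \psi_\infty(t) = e^t u(e^{-t}) \le e^t \left\{ \left( 1 - \frac{e^s}{e^t} \right) u(0) + \frac{e^s}{e^t}\, u(e^{-s}) \right\} = \psi_\infty(s). \]

Finally, for $N < 0$ and $0 < s < t$, it holds

\[ \psi_N(s) = s^N u(s^{-N}) \le s^N \left\{ \left( 1 - \frac{t^N}{s^N} \right) u(0) + \frac{t^N}{s^N}\, u(t^{-N}) \right\} = \psi_N(t). \]

□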



It is also known that $\mathcal{DC}_{N'} \subset \mathcal{DC}_N$ for $1 \le N \le N' \le \infty$. This monotonicity in $N$ is
violated by extending to $N < 0$, but the monotonicity in $m = (N-1)/N \in [0,\infty)$ holds
instead. Compare this with the monotonicity of $\mathrm{Ric}_N$ in $m$ (Remark 2.2).

Lemma 3.3 For each $N, N' \in (-\infty,0) \cup [1,\infty]$ with $m \le m'$, we have $\mathcal{DC}_{N'} \subset \mathcal{DC}_N$,
where we set $m = (N-1)/N$, $m' = (N'-1)/N'$ and $m = 1$ if $N = \infty$ (resp. $m' = 1$ if
$N' = \infty$).

Proof. We first consider the case of $0 \le m < m' < 1$ (equivalently, $1 \le N < N' < \infty$).
For any $u \in \mathcal{DC}_{N'}$ and $r > 0$, we observe

\[ \psi_N(r) = r^N u(r^{-N}) = (r^{N/N'})^{N'}\, u\bigl( (r^{N/N'})^{-N'} \bigr) = \psi_{N'}(r^{N/N'}). \]

This is convex in $r$ since the function $r \mapsto r^{N/N'}$ is concave and $\psi_{N'}$ is convex and
non-increasing. Thus $u \in \mathcal{DC}_N$ and hence $\mathcal{DC}_{N'} \subset \mathcal{DC}_N$.

The other cases are similar. For $1 < m < m'$, we have $N < N' < 0$ so that $r \mapsto r^{N/N'}$
is convex and $\psi_{N'}$ is non-decreasing. When $m' = 1 > m$, $\psi_N(r) = \psi_\infty(N \log r)$ holds and
note that $r \mapsto N \log r$ is concave and $\psi_\infty$ is non-increasing. For $m = 1 < m'$, we find
$\psi_\infty(r) = \psi_{N'}(e^{r/N'})$ and that $r \mapsto e^{r/N'}$ is convex and $\psi_{N'}$ is non-decreasing. □

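We shall write down a condition for $u_\varphi \in \mathcal{DC}_N$ on $\varphi$. As $u_\varphi$ is continuous, convex and
satisfies $u_\varphi(0) = 0$ by definition once it is well-defined, it is sufficient to check (2.1) and
the convexity of $\psi_N$.

Proposition 3.4 Assume that $\varphi$ satisfies the condition (2.1). Then the function $\psi_N$ for
$N \in (-\infty,0) \cup (1,\infty]$ is convex if and only if

\[ \int_0^t \frac{s}{\varphi(s)}\, ds \le \frac{N}{N-1} \cdot \frac{t^2}{\varphi(t)} \qquad (3.1) \]

holds for all $t > 0$, where $N/(N-1) = 1$ if $N = \infty$.

Proof. We first of all recall that $u_\varphi$ is well-defined thanks to (2.1) (Lemma 2.8). For
$N \in (-\infty,0) \cup (1,\infty)$ and $r > 0$, we calculate

\[ \psi_N''(r) = N(N-1)\, r^{N-2} \left\{ u_\varphi(r^{-N}) - r^{-N} \ln_\varphi(r^{-N}) + \frac{N}{N-1} \cdot \frac{r^{-2N}}{\varphi(r^{-N})} \right\}. \]

Note that $N(N-1) > 0$. For any $t > 0$, we have

\[ u_\varphi(t) - t \ln_\varphi(t) + \frac{N}{N-1} \cdot \frac{t^2}{\varphi(t)} = -\int_0^t \frac{s}{\varphi(s)}\, ds + \frac{N}{N-1} \cdot \frac{t^2}{\varphi(t)}. \]

Therefore $\psi_N'' \ge 0$ if and only if (3.1) holds. For $N = \infty$, we similarly obtain

\[ \psi_\infty''(r) = e^r \left\{ u_\varphi(e^{-r}) - e^{-r} \ln_\varphi(e^{-r}) + \frac{e^{-2r}}{\varphi(e^{-r})} \right\} \]

for $r \in \mathbb{R}$, and

\[ u_\varphi(t) - t \ln_\varphi(t) + \frac{t^2}{\varphi(t)} = -\int_0^t \frac{s}{\varphi(s)}\, ds + \frac{t^2}{\varphi(t)} \]

for $t > 0$. □
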

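Theorem 3.5 If $\theta_\varphi < 2$, then the condition (2.1) holds and we have $u_\varphi \in \mathcal{DC}_{N_\varphi}$.

Proof. We deduce from Lemma 2.9 that

\[ 0 < \frac{s^{\theta_\varphi}}{\varphi(s)} \le \frac{1}{\varphi(1)} \]

for all $s \in (0,1)$, this implies (2.1) since $\theta_\varphi < 2$. Lemma 2.9 also yields that, for any $t > 0$,

\[ \int_0^t \frac{s}{\varphi(s)}\, ds \le \int_0^t \frac{t^{\theta_\varphi}}{\varphi(t)}\, s^{1-\theta_\varphi}\, ds = \frac{1}{2 - \theta_\varphi} \cdot \frac{t^2}{\varphi(t)}. \]

This is nothing but (3.1) with $N = N_\varphi$ (recall the definition of $N_\varphi$ in (2.5)), and hence
$u_\varphi \in \mathcal{DC}_{N_\varphi}$ by Proposition 3.4. □

Recall that $\varphi_m(s) = s^{2-m}$ with $m \in (0,2]$ satisfies $\theta_m = 2 - m < 2$. Hence Theorem 3.5
shows $u_{\varphi_m} \in \mathcal{DC}_{N_m}$. We close the section with a partial converse of Theorem 3.5.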

Proposition 3.6 If the condition (2.1) holds, $\delta_\varphi < 2$ and if we have $u_\varphi \in \mathcal{DC}_N$ with
some $N \in (-\infty,0) \cup (1,\infty]$, then it holds $\delta_\varphi \le (N+1)/N$ (where $(N+1)/N = 1$ for
$N = \infty$).

Proof. Lemma 2.9 with Proposition 3.4 yields that, for any $t > 0$,

\[ \frac{N}{N-1} \cdot \frac{t^2}{\varphi(t)} \ge \int_0^t \frac{s}{\varphi(s)}\, ds \ge \int_0^t \frac{t^{\delta_\varphi}}{\varphi(t)}\, s^{1-\delta_\varphi}\, ds = \frac{1}{2 - \delta_\varphi} \cdot \frac{t^2}{\varphi(t)}, \]

which shows $\delta_\varphi \le (N+1)/N$ as desired. □

In particular, for $m \in (0,2]$, $u_{\varphi_m} \in \mathcal{DC}_N$ if and only if $m \ge (N-1)/N$.
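
The criterion $m \ge (N-1)/N$ for the power case can be illustrated numerically. The sketch below is a heuristic check and not a proof: it evaluates $\psi_N(r) = r^N u(r^{-N})$ from Definition 3.1 for $u(r) = (r^m - mr)/\{m(m-1)\}$ and tests midpoint convexity on a finite uniform grid. The grid, the tolerance and the function names are assumptions made for this example, and borderline cases where $\psi_N$ is affine are deliberately avoided because they are sensitive to rounding.

```python
def u_m(r: float, m: float) -> float:
    """Energy density of the power case: (r**m - m*r) / (m*(m-1)), m not in {0, 1}."""
    return (r ** m - m * r) / (m * (m - 1.0))


def psi_N(r: float, N: float, m: float) -> float:
    """psi_N(r) = r**N * u(r**(-N)) from Definition 3.1, for finite N."""
    return r ** N * u_m(r ** (-N), m)


def looks_convex(f, grid, tol: float = 1e-12) -> bool:
    """Midpoint-convexity test on consecutive triples of a uniform grid."""
    return all(
        f(b) <= 0.5 * (f(a) + f(c)) + tol
        for a, b, c in zip(grid, grid[1:], grid[2:])
    )


def in_dc_class_numeric(N: float, m: float) -> bool:
    grid = [0.2 + 0.05 * k for k in range(50)]   # hypothetical test grid in (0, infinity)
    return looks_convex(lambda r: psi_N(r, N, m), grid)


# N = 3: the threshold is (N - 1) / N = 2/3.
positive_case = in_dc_class_numeric(3.0, 0.8)    # m above the threshold
negative_case = in_dc_class_numeric(3.0, 0.5)    # m below the threshold

# N = -2: the threshold is (N - 1) / N = 3/2.
positive_case_neg_N = in_dc_class_numeric(-2.0, 1.8)
negative_case_neg_N = in_dc_class_numeric(-2.0, 1.2)
```

In this example $\psi_N(r) = (r^{N(1-m)} - m)/\{m(m-1)\}$ is a power function up to constants, which makes the sign of its second derivative easy to confirm by hand.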

4 Admissible spaces 

This section is devoted to introducing the class of spaces admissible in our consideration. 
Recall our weighted Riemannian manifold (M, u) and a function (f as in Subsection 12.31 
From here on, we further fix the reference measure 

u = au := exp^(— \l/)ci;, 

where \l/ G C{M) such that 

* > -L^ on M, := ((-L^, -Q) ^ 0. (4.1) 

Note that supp u = M* ^ 0. For later convenience, let us define the ii'-convexity of a 
function on a general metric space. 

Definition 4.1 ($K$-convexity) Given $K \in \mathbb{R}$, we say that a function $\Psi : X \to
(-\infty,\infty]$ on a metric space $(X,d)$ is $K$-convex in the weak sense (denoted by $\mathrm{Hess}\,\Psi \ge K$
by slight abuse of notation) if it is not identically $+\infty$ and, for any two points $x,y \in X$,
there exists a minimal geodesic $\gamma : [0,1] \to X$ from $x$ to $y$ along which

$$\Psi(\gamma(t)) \le (1-t)\Psi(x) + t\Psi(y) - \frac{K}{2}(1-t)t\, d(x,y)^2 \tag{4.2}$$

holds for all $t \in [0,1]$.
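As an illustration (not taken from the source), the quadratic function on Euclidean space attains equality in (4.2). We take $X = \mathbb{R}^n$ with the Euclidean distance and $\Psi(x) = \frac{K}{2}|x|^2$; both choices are assumptions made only for this example.

```latex
% Example: X = \mathbb{R}^n, \Psi(x) = (K/2)|x|^2, \gamma(t) = (1-t)x + ty.
\[
  |(1-t)x + ty|^2 = (1-t)|x|^2 + t|y|^2 - (1-t)t\,|x-y|^2 ,
\]
% which follows by expanding both sides. Multiplying by K/2 gives
\[
  \Psi(\gamma(t)) = (1-t)\Psi(x) + t\Psi(y) - \frac{K}{2}(1-t)t\,|x-y|^2 ,
\]
% i.e. (4.2) holds with equality, so Hess \Psi \ge K in the sense of Definition 4.1.
```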

We remark that, on a Riemannian manifold $M$, (4.2) certainly holds for any minimal
geodesic $\gamma : [0,1] \to M$ by approximation. Indeed, $\gamma|_{[\varepsilon,1-\varepsilon]}$ is a unique minimal geodesic
for any $\varepsilon > 0$ and $\Psi$ is continuous. We are interested in the situation that $\mathrm{Ric}_{N_\varphi} \ge 0$ as
well as $\mathrm{Hess}\,\Psi \ge K$ hold (see Theorem 5.7). Finer analysis is possible in the particular
case of $K > 0$ (Sections 6, 7). We prove a lemma in such a case for later use. The open
ball of center $x \in M$ and radius $r > 0$ will be denoted by $B(x,r)$.

Lemma 4.2 Suppose that $\varphi(1) = 1$ (this costs no generality, see Remark 2.10), $\mathrm{Hess}\,\Psi \ge
K$ for some $K > 0$, and take a minimizer $x_0 \in M$ of $\Psi$.

(i) If $l_\varphi > -\infty$, then the set $M^\varphi_\Psi$ as in (4.1) is totally convex and $M^\varphi_\Psi \subset B(x_0,R)$
holds with $R = \sqrt{-2(l_\varphi + \Psi(x_0))/K}$. Moreover, $\mathrm{supp}\,\nu$ is also totally convex and
compact.

(ii) If $l_\varphi = -\infty$, $N_\varphi \in [n,\infty)$ and $\mathrm{Ric}_{N_\varphi} \ge 0$, then we have $M^\varphi_\Psi = M$, $\nu[M] < \infty$ and

$$\int_M d_g(x_0,x)^p\, \sigma(x)^a \, d\omega(x) \le C_1\, \nu[M]^a + C_2\, K^{-aN_\varphi} < \infty$$

for any $a \in (1/2,1]$ and $p \in [0,\infty)$ satisfying $(2a-1)N_\varphi - p > 0$, where $C_1 = C_1(\omega)$
and $C_2 = C_2(a,p,\theta_\varphi,\omega)$. In particular, $\sigma \in L^a(M,\omega)$ for all $a \in (1/2,1]$ and
$\nu[M]^{-1} \cdot \nu \in \mathcal{P}^p_{\mathrm{ac}}(M,\omega)$ for all $p \in [0,N_\varphi)$.

(iii) If $N_\varphi = \infty$ and $\mathrm{Ric}_{N_\varphi} \ge 0$, then $\sigma \in L^a(M,\omega)$ for any $a > 0$.

Proof. We first remark that the assumption $\mathrm{Hess}\,\Psi \ge K > 0$ guarantees the unique
existence of a point $x_0 \in M^\varphi_\Psi$ such that $\Psi(x_0) = \inf_M \Psi$. We deduce from the $K$-convexity
(4.2) that

$$\Psi(\gamma(1)) \ge \Psi(x_0) + \frac{K}{2}\, d_g(x_0,\gamma(1))^2 \tag{4.3}$$

holds for all minimal geodesics $\gamma : [0,1] \to M$ with $\gamma(0) = x_0$.
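For completeness, here is one way to obtain (4.3) from (4.2); it is a sketch that uses only the minimality of $x_0$ and the fact, remarked after Definition 4.1, that (4.2) holds along every minimal geodesic of $M$.

```latex
% Let \gamma be a minimal geodesic with \gamma(0) = x_0 and put y := \gamma(1).
% Minimality of x_0 and (4.2) give, for t \in (0,1),
\[
  \Psi(x_0) \le \Psi(\gamma(t))
  \le (1-t)\Psi(x_0) + t\Psi(y) - \frac{K}{2}(1-t)t\, d_g(x_0,y)^2 .
\]
% Subtracting \Psi(x_0) and dividing by t > 0:
\[
  0 \le \Psi(y) - \Psi(x_0) - \frac{K}{2}(1-t)\, d_g(x_0,y)^2 .
\]
% Letting t \downarrow 0 yields \Psi(y) \ge \Psi(x_0) + (K/2)\, d_g(x_0,y)^2, i.e. (4.3).
```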

(i) For any minimal geodesic $\gamma : [0,1] \to M$ connecting two points $x,y \in M^\varphi_\Psi$, we
have

$$\Psi(\gamma(t)) \le (1-t)\Psi(x) + t\Psi(y) - \frac{K}{2}(1-t)t\, d_g(x,y)^2 < -l_\varphi$$

so that $\gamma$ is contained in $M^\varphi_\Psi$. The total convexity of $\mathrm{supp}\,\nu$ can be seen similarly.
Precisely, for any $x,y \in \Psi^{-1}((-L_\varphi, -l_\varphi])$, $\gamma$ as above satisfies $\gamma((0,1)) \subset M^\varphi_\Psi$. This in fact
implies that $\mathrm{supp}\,\nu = \overline{M^\varphi_\Psi} = \Psi^{-1}((-L_\varphi, -l_\varphi])$ and is totally convex. We moreover obtain
$M^\varphi_\Psi \subset B(x_0,R)$ from (4.3), and thus $\mathrm{supp}\,\nu$ is compact.

(ii) The first assertion $M^\varphi_\Psi = M$ is obvious by definition (see (4.1)). Note that $N_\varphi \in
[n,\infty)$ implies $\theta_\varphi \in (1,(n+1)/n]$ (see (2.5)). Set $m := 2 - \theta_\varphi < 1$ and take $a \in (1/2,1]$
and $p \ge 0$ satisfying $(2a-1)N_\varphi - p > 0$. Then (2.10) and (4.3) imply

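$$\int_M d_g(x_0,x)^p\, \sigma(x)^a \, d\omega(x) \le \int_{B(x_0,1)} \sigma^a \, d\omega + \int_1^\infty r^p\, e_m\Big( -\Psi(x_0) - \frac{K}{2} r^2 \Big)^a \mathrm{area}_\omega[S(x_0,r)] \, dr.$$

We mention that $m = (N_\varphi - 1)/N_\varphi$ and, for $s < 0$ and $t < 0$, we have

$$e_m(s+t) = \big\{ e_m(s)^{m-1} + (m-1)t \big\}^{1/(m-1)}.$$

Thanks to the hypothesis $\mathrm{Ric}_{N_\varphi} \ge 0$ with $N_\varphi \in [n,\infty)$, we can estimate the second term
by Theorem 2.3 as

$$\begin{aligned}
&\int_1^\infty r^p\, e_m\Big( -\Psi(x_0) - \frac{K}{2} r^2 \Big)^a \mathrm{area}_\omega[S(x_0,r)] \, dr \\
&\quad\le \mathrm{area}_\omega[S(x_0,1)] \int_1^\infty \Big\{ e_m(-\Psi(x_0))^{m-1} + (1-m)\frac{K}{2} r^2 \Big\}^{a/(m-1)} r^{p+N_\varphi-1} \, dr \\
&\quad= \mathrm{area}_\omega[S(x_0,1)] \int_1^\infty \Big\{ e_m(-\Psi(x_0))^{m-1} r^{-2} + (1-m)\frac{K}{2} \Big\}^{a/(m-1)} r^{-(2a-1)N_\varphi+p-1} \, dr \\
&\quad\le \mathrm{area}_\omega[S(x_0,1)] \Big\{ (1-m)\frac{K}{2} \Big\}^{a/(m-1)} \int_1^\infty r^{-(2a-1)N_\varphi+p-1} \, dr \\
&\quad= \frac{1}{(2a-1)N_\varphi - p}\, \mathrm{area}_\omega[S(x_0,1)] \Big\{ (1-m)\frac{K}{2} \Big\}^{a/(m-1)}.
\end{aligned}$$

We used the condition $(2a-1)N_\varphi - p > 0$ in the last equality. Choosing $a = 1$ and $p = 0$,
we in particular find $\nu[M] < \infty$. Then the Hölder inequality yields that

$$\int_{B(x_0,1)} \sigma^a \, d\omega \le \Big( \int_{B(x_0,1)} \sigma \, d\omega \Big)^a \omega[B(x_0,1)]^{1-a} \le \nu[M]^a\, \omega[B(x_0,1)]^{1-a}.$$

Thus choosing

$$C_1 := \max\{\omega[B(x_0,1)], 1\} \ge \omega[B(x_0,1)]^{1-a}, \qquad C_2 := \frac{1}{(2a-1)N_\varphi - p} \Big( \frac{2}{1-m} \Big)^{a/(1-m)} \mathrm{area}_\omega[S(x_0,1)]$$

gives the desired estimate.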

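(iii) Combining (4.3) with (2.10) provides, as $\theta_\varphi = 1$,

$$\int_M \sigma(x)^a \, d\omega(x) \le \int_M \exp\Big( -\Psi(x_0) - \frac{K}{2} d_g(x_0,x)^2 \Big)^a d\omega(x) \le \exp\big(-a\Psi(x_0)\big) \int_M \exp\Big( -\frac{aK}{2} d_g(x_0,x)^2 \Big) d\omega(x).$$

Hence the assertion follows from Theorem 2.4. □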
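The algebraic identity for $e_m$ used in part (ii) can be verified directly. The sketch below assumes that $e_m$ denotes the exponential function associated with $\varphi_m(s) = s^{2-m}$, so that $e_m(\tau) = \{1 + (m-1)\tau\}^{1/(m-1)}$ wherever the bracket is positive; this explicit formula is our reading of Subsection 2.4 and is not quoted from it.

```latex
% Assumption: \ln_m(t) = \int_1^t s^{m-2}\,ds = (t^{m-1} - 1)/(m-1), hence
%             e_m(\tau) = \{1 + (m-1)\tau\}^{1/(m-1)}  (m \neq 1).
\[
  e_m(s)^{m-1} = 1 + (m-1)s
  \quad\Longrightarrow\quad
  \big\{ e_m(s)^{m-1} + (m-1)t \big\}^{1/(m-1)}
  = \big\{ 1 + (m-1)(s+t) \big\}^{1/(m-1)}
  = e_m(s+t),
\]
% valid whenever 1 + (m-1)s > 0 and 1 + (m-1)(s+t) > 0,
% in particular for m < 1 and s, t \le 0.
% With t = -(K/2) r^2 one gets (m-1)t = (1-m)(K/2) r^2, the term appearing in part (ii).
```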

Now we introduce the conditions for a quadruple $(M,\omega,\varphi,\Psi)$ to be admissible in our
consideration.

Definition 4.3 (Admissibility) We say that a quadruple $(M,\omega,\varphi,\Psi)$ is admissible if
all the following conditions hold:

(A-1) $\varphi(1) = 1$.

(A-2) $N_\varphi \in (-\infty,-1] \cup [n,\infty]$ and $N_\varphi \neq 2$ or, equivalently, $\theta_\varphi \in [0,(n+1)/n]$ and
$\theta_\varphi < 3/2$.

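(A-3) $\Psi > -L_{2-\theta_\varphi}$ on $M$ and $M^\varphi_\Psi = \Psi^{-1}((-L_\varphi, -l_\varphi)) \neq \emptyset$.
(A-4) $h_\varphi(\sigma) \in L^1(M,\omega)$ and $\sigma\ln_\varphi(\sigma) \in L^1(M,\omega)$, where $h_\varphi := u_\varphi$ if $L_\varphi = \infty$ and
$h_\varphi(r) := u_\varphi(r) - rL_\varphi$ if $L_\varphi < \infty$ (see (5.3) below).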

We mention that $\ln_m \le \ln_\varphi$ and $\exp_\varphi \le e_m$ hold with $m = 2 - \theta_\varphi$ by (2.9), and hence
$M^\varphi_\Psi \subset M^m_\Psi$ by (A-3). The first condition $\varphi(1) = 1$ is merely the normalization (see
Remark 2.10), and (A-4) is imposed for $\nu$ being adopted as a reference measure of the
Bregman divergence (see (2.2)). The next lemma ensures that (A-4) automatically holds
if $\mathrm{Ric}_{N_\varphi} \ge 0$ and $\mathrm{Hess}\,\Psi \ge K$ for some $K > 0$.

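Lemma 4.4 Suppose that $(M,\omega,\varphi,\Psi)$ satisfies (A-1), (A-2), $\Psi > -L_\varphi$ on $M$, $M^\varphi_\Psi \neq \emptyset$,
$\mathrm{Ric}_{N_\varphi} \ge 0$ and $\mathrm{Hess}\,\Psi \ge K$ for some $K > 0$. Then (A-4) also holds.

Proof. The case of $l_\varphi > -\infty$ is clear due to Lemma 4.2(i), so that we assume $l_\varphi = -\infty$
and then $\theta_\varphi \ge 1$ (Proposition 2.12(i)). Observe also that $u_\varphi(\sigma) \in L^1(M,\omega)$ implies
$h_\varphi(\sigma) \in L^1(M,\omega)$ since $\nu[M] < \infty$ by Lemma 4.2(ii), (iii). Let $x_0$ be the minimizer of $\Psi$
and set $R := \sqrt{\max\{1, -2\Psi(x_0)\}/K}$. Note that the $K$-convexity of $\Psi$ (4.3) guarantees
that, on $M \setminus B(x_0,R)$,

$$0 < \sigma = \exp_\varphi(-\Psi) \le \exp_\varphi\Big( -\Psi(x_0) - \frac{K}{2} R^2 \Big) \le \exp_\varphi(0) = 1.$$

We first consider the case of $\theta_\varphi > 1$. On $M \setminus B(x_0,R)$, (2.9) implies that

$$|\sigma\ln_\varphi(\sigma)| = -\sigma\ln_\varphi(\sigma) \le -\sigma\ln_{2-\theta_\varphi}(\sigma) = N_\varphi(\sigma^{2-\theta_\varphi} - \sigma),$$

$$|u_\varphi(\sigma)| = -\int_0^\sigma \ln_\varphi(t)\,dt \le \int_0^\sigma N_\varphi(t^{1-\theta_\varphi} - 1)\,dt = N_\varphi\Big( \frac{\sigma^{2-\theta_\varphi}}{2-\theta_\varphi} - \sigma \Big).$$

Thus we have

$$\int_M |\sigma\ln_\varphi(\sigma)| \, d\omega \le \int_{B(x_0,R)} |\sigma\ln_\varphi(\sigma)| \, d\omega + \int_{M \setminus B(x_0,R)} N_\varphi(\sigma^{2-\theta_\varphi} - \sigma) \, d\omega,$$

$$\int_M |u_\varphi(\sigma)| \, d\omega \le \int_{B(x_0,R)} |u_\varphi(\sigma)| \, d\omega + \int_{M \setminus B(x_0,R)} N_\varphi\Big( \frac{\sigma^{2-\theta_\varphi}}{2-\theta_\varphi} - \sigma \Big) d\omega.$$

As $2 - \theta_\varphi \in (1/2,1)$ by $\theta_\varphi < 3/2$, Lemma 4.2(ii) ensures $u_\varphi(\sigma), \sigma\ln_\varphi(\sigma) \in L^1(M,\omega)$.
In the case of $\theta_\varphi = 1$, we similarly have on $M \setminus B(x_0,R)$

$$|\sigma\ln_\varphi(\sigma)| \le -\sigma\ln\sigma \le \sqrt{\sigma}, \qquad |u_\varphi(\sigma)| = -\int_0^\sigma \ln_\varphi(t)\,dt \le \int_0^\sigma \frac{1}{\sqrt{t}}\,dt = 2\sqrt{\sigma}.$$

Then the claim follows from Lemma 4.2(iii). □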
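We close the section with an auxiliary lemma on how to normalize $\nu$ when $K > 0$.

Lemma 4.5 Let $(M,\omega,\varphi,\Psi)$ be admissible, $\mathrm{Ric}_{N_\varphi} \ge 0$ and $\mathrm{Hess}\,\Psi \ge K$ for some $K > 0$,
and set $I = (l,L) := (l_\varphi + \inf_M \Psi,\ L_\varphi + \inf_M \Psi)$. We in addition assume that $\Psi$ is
differentiable at the minimizer of $\Psi$ if $N_\varphi = n$. Then there exists some $\lambda \in I$ such that
$\exp_\varphi(\lambda - \Psi)\,\omega \in \mathcal{P}_{\mathrm{ac}}(M,\omega)$.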
Proof. We first remark that $\inf_M \Psi > -\infty$, and that for any $\lambda \in I$

$$\Psi - \lambda > \inf_M \Psi - \big( L_\varphi + \inf_M \Psi \big) = -L_\varphi$$

as well as $\mathrm{Hess}(\Psi - \lambda) = \mathrm{Hess}\,\Psi \ge K$ hold. Thus

$$\Xi(\lambda) := \int_M \exp_\varphi(\lambda - \Psi) \, d\omega < \infty$$

by Lemma 4.2. Since $\Xi$ is non-decreasing and continuous on $I$ by Lebesgue's dominated
convergence theorem (or the monotone convergence theorem), we are done if $\lim_{\lambda \downarrow l} \Xi(\lambda) <
1 < \lim_{\lambda \uparrow L} \Xi(\lambda)$ holds. We also deduce from the dominated convergence theorem that
$\lim_{\lambda \downarrow l} \Xi(\lambda) = 0$. If $L_\varphi = \infty$, then we find $\lim_{\lambda \uparrow \infty} \Xi(\lambda) = \infty$ by the monotone convergence
theorem.

The rest is to prove $\lim_{\lambda \uparrow L} \Xi(\lambda) > 1$ when $L_\varphi < \infty$. Note that $L_\varphi < \infty$ implies
$\lim_{s \to \infty} \varphi(s) = \infty$ by definition, and $\theta_\varphi > 1$ (i.e., $N_\varphi \in [n,\infty)$) by Proposition 2.12(ii). Let
$x_0 \in M$ be the unique minimizer of $\Psi$ and take $R_0 > 0$ such that $B(x_0,R_0) \subset M^\varphi_\Psi$ and that
$B(x_0,R_0)$ contains no cut point of $x_0$. Then, for any $x \in S(x_0,r)$ with $0 < r < R < R_0$,
the $K$-convexity of $\Psi$ provides

$$\Psi(x) \le \Big( 1 - \frac{r}{R} \Big) \Psi(x_0) + \frac{r}{R} \sup_{S(x_0,R)} \Psi - \frac{K}{2} \Big( 1 - \frac{r}{R} \Big) \frac{r}{R}\, R^2 = \Psi(x_0) + \frac{K}{2} r^2 + ar, \tag{4.4}$$

where we set

$$a = a(R) := \frac{1}{R} \Big( \sup_{S(x_0,R)} \Psi - \Psi(x_0) - \frac{K}{2} R^2 \Big) \ge 0.$$

Observe that $\lim_{R \downarrow 0} a(R) < \infty$ by the $K$-convexity of $\Psi$, and that $\lim_{R \downarrow 0} a(R) = 0$ holds
if $N_\varphi = n$ since $\Psi$ is assumed to be differentiable at $x_0$. In both cases ($N_\varphi > n$ or $N_\varphi = n$)
we can choose $R \in (0,R_0]$ small enough to satisfy

^d2 , D/r o ^ areaa;[^(xo,i?)] y^^^ 

yi? + ai? < L^, 2a < n^rn.-^ J " (^•^) 



Then take large $\lambda \in I$ such that

$$\lambda > \Psi(x_0) + \frac{K}{2} R^2 + aR.$$

Set

$$e_\lambda(r) := \exp_\varphi\Big( \lambda - \Psi(x_0) - \frac{K}{2} r^2 - ar \Big)$$

and note that it is decreasing.

We deduce from Theorem 12.31 and fl4.4p that 



(A) > / exp^{X - duj > / area^[S'(xo,r)]e^(r) 
Jb(xo,r) Jo 



'B{xo,R) 

^ area^^(xo,i?)] y ^7v,-i^a^^^ 



area4g(xo,i?)] /i^^-.A.p^ f uxy. 
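The concavity of $\ln_\varphi$ yields

$$\ln_\varphi(t) \le \ln_\varphi(e_\lambda(0)) + \frac{t - e_\lambda(0)}{\varphi(e_\lambda(0))},$$

and hence it holds for $t := e_\lambda(r)$ that

$$Kr = -a + \sqrt{a^2 + 2K\{\ln_\varphi(e_\lambda(0)) - \ln_\varphi(t)\}} \ge -a + \sqrt{a^2 + 2K\,\frac{e_\lambda(0) - t}{\varphi(e_\lambda(0))}}.$$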

By the change of variables formula for t = e^(r), we have 

r^-(ej)'(r)c/r=— ^ / -a + y^a^ + {ln^(e^(0)) - Mt)} dt 

















J^iR) 



Combining the triangle inequality 



-a+ Ja^ + 2K 7 ^ 



with Jensen's inequality for the convex function s i— )■ (-y/a^ + s)^*^ (s > 0), we deduce that 



2K e^(0) - 1 

^{^)JeUR) e^(0)-e^(i?)' 



> {ei(0) - ej(i?)} <lJa2 + -^737^ / ...^ \..on ^^ - « 
Hence we obtain, as $\varphi(s) \le s^{\theta_\varphi}$ for $s \ge 1$ by Lemma 2.9,



limS(A)>^^^^#^lim 



areata [5* (xo,-R)] 
_ area^[S'(xo, R)] 



Ks 



s + e^{R)y^ 



area^ [S{xo,R)] 



N^R^^-^K^f 40 



lim 



+ A't 



+ Ks^-^-^ - a) 



a 



t 



area^[5(a;o,i?)]_^2a)-^^ 



This is bigger than 1 by the choice of $R$ (recall (4.5)) and we complete the proof. □



Remark 4.6 If $\mathrm{Ric}_{N_\varphi} \ge 0$ and $\mathrm{Hess}\,\Psi \ge K$ for some $K > 0$, then Lemma 4.2 yields
$\nu[M] < \infty$. We will sometimes normalize $(M,\omega,\Psi)$ so as to satisfy $\nu[M] = 1$ (Sections 6, 7).
There are two ways of such a normalization:

• Put $a := \nu[M]^{-1}$ and consider $(M,\tilde\omega,\varphi,\Psi) := (M, a\omega, \varphi, \Psi)$, i.e., $\tilde\omega = e^{-f+\ln a}\,\mathrm{vol}_g$.

• Take $\lambda$ as in Lemma 4.5 and consider $(M,\omega,\varphi,\tilde\Psi) := (M,\omega,\varphi,\Psi - \lambda)$.

In both cases it is easily seen that the conditions $\mathrm{Ric}_{N_\varphi} \ge 0$ and $\mathrm{Hess}\,\Psi \ge K$ are preserved.
These two normalizations are equivalent when $\varphi = \varphi_1$, where we indeed observe

$$\exp(-\Psi)\,\tilde\omega = \exp(-\Psi - f + \ln a)\,\mathrm{vol}_g, \qquad \exp(-\tilde\Psi)\,\omega = \exp(-\Psi - f + \lambda)\,\mathrm{vol}_g,$$

and hence $\lambda = \ln a$.



5 $\varphi$-relative entropy and its displacement convexity

In this section, we introduce a generalization of the relative entropy, that we call the
$\varphi$-relative entropy, associated with functions $\varphi$ in an appropriate class. For the relative
entropy on a (unweighted) Riemannian manifold, it is known by von Renesse and
Sturm [vRS] that its $K$-convexity in the Wasserstein space $(\mathcal{P}^2(M), W_2)$ (in the sense of
Definition 4.1) is equivalent to the lower Ricci curvature bound $\mathrm{Ric} \ge K$. Then it was
shown by Lott and Villani [LV2] that $\mathrm{Ric} \ge K$ further implies a kind of convexity property
of a class of entropies including the relative entropy. In this sense, the relative entropy is
an extremal element in such a class of entropies. In the same spirit, our main theorem
in the section (Theorem 5.7) asserts that the $m$-relative entropy induced from $\varphi = \varphi_m$
(studied in [OT], recall Subsection 2.4 as well) is an extremal one in an appropriate class
of $\varphi$-relative entropies.
5.1 Curvature-dimension condition

We begin with a brief review of Lott, Sturm and Villani's curvature-dimension condition.
To formulate it, we need to introduce the two functions often used in comparison theorems
in Riemannian geometry.

For $K \in \mathbb{R}$, $N \in (1,\infty)$ and $0 < r$ ($< \pi\sqrt{(N-1)/K}$ if $K > 0$), we consider the
function

$$\mathbf{s}_{K,N}(r) := \begin{cases} \sqrt{(N-1)/K}\,\sin\big( r\sqrt{K/(N-1)} \big) & \text{if } K > 0, \\ r & \text{if } K = 0, \\ \sqrt{-(N-1)/K}\,\sinh\big( r\sqrt{-K/(N-1)} \big) & \text{if } K < 0. \end{cases}$$

This is the solution to the differential equation

$$\mathbf{s}''_{K,N} + \frac{K}{N-1}\,\mathbf{s}_{K,N} = 0$$

with the initial conditions $\mathbf{s}_{K,N}(0) = 0$ and $\mathbf{s}'_{K,N}(0) = 1$. For $n \in \mathbb{N}$ with $n \ge 2$, $\mathbf{s}_{K,n}(r)^{n-1}$
is proportional to the area of the sphere of radius $r$ in the $n$-dimensional space form of
constant sectional curvature $K/(n-1)$. Using $\mathbf{s}_{K,N}$, we define

$$\beta^t_{K,N}(r) := \left( \frac{\mathbf{s}_{K,N}(tr)}{t\,\mathbf{s}_{K,N}(r)} \right)^{N-1}$$

for $K, N, r$ as above and $t \in (0,1)$. Now, we are ready to state Sturm's curvature-dimension
condition characterizing lower Ricci curvature bounds, developed in [vRS], [St1]
and [St3] in gradually general situations (see also [CMS1], [CMS2] for related important
work). Recall (2.8) and (2.6) for the definitions of the Rényi entropy $S_N$ and the relative
entropy $\mathrm{Ent}_\omega$.

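Theorem 5.1 (Sturm's curvature-dimension condition) Let $(M,\omega)$ be a weighted
Riemannian manifold. We have $\mathrm{Ric}_N \ge K$ for some $K \in \mathbb{R}$ and $N \in [n,\infty)$ if and only
if any pair of measures $\mu_0 = \rho_0\omega$, $\mu_1 = \rho_1\omega \in \mathcal{P}^2_{\mathrm{ac}}(M,\omega)$ satisfies

$$\begin{aligned} S_N(\mu_t) \le{} & -(1-t) \int_{M \times M} \beta^{1-t}_{K,N}\big( d_g(x,y) \big)^{1/N} \rho_0(x)^{-1/N} \, d\pi(x,y) \\ & - t \int_{M \times M} \beta^{t}_{K,N}\big( d_g(x,y) \big)^{1/N} \rho_1(y)^{-1/N} \, d\pi(x,y) \end{aligned} \tag{5.1}$$

for all $t \in (0,1)$, where $(\mu_t)_{t \in [0,1]} \subset \mathcal{P}^2_{\mathrm{ac}}(M,\omega)$ is the unique minimal geodesic from $\mu_0$ to
$\mu_1$, and $\pi$ is the unique optimal coupling of $\mu_0$ and $\mu_1$.

Similarly, $\mathrm{Ric}_\infty \ge K$ is equivalent to the $K$-convexity of $\mathrm{Ent}_\omega$,

$$\mathrm{Ent}_\omega(\mu_t) \le (1-t)\mathrm{Ent}_\omega(\mu_0) + t\,\mathrm{Ent}_\omega(\mu_1) - \frac{K}{2}(1-t)t\,W_2(\mu_0,\mu_1)^2.$$

For $K = 0$, we find $\beta^t_{0,N} \equiv 1$ and (5.1) is nothing but the convexity of $S_N$,

$$S_N(\mu_t) \le (1-t)S_N(\mu_0) + tS_N(\mu_1).$$

For $K \neq 0$, however, (5.1) is not simply the $K$-convexity of $S_N$.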
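The reduction in the case $K = 0$ can be written out explicitly. The sketch below assumes that the Rényi entropy in (2.8) is $S_N(\mu) = -\int_M \rho^{1-1/N}\,d\omega$ for $\mu = \rho\omega$ and that (5.1) has the standard form with the factors $\beta^{1-t}_{K,N}$ and $\beta^t_{K,N}$ raised to the power $1/N$; both are our reading of the source rather than quotations.

```latex
% K = 0: s_{0,N}(r) = r, hence
\[
  \beta^t_{0,N}(r) = \Big( \frac{tr}{t\,r} \Big)^{N-1} = 1 .
\]
% Since \pi has marginals \mu_0 = \rho_0\omega and \mu_1 = \rho_1\omega,
\[
  -\int_{M\times M} \rho_0(x)^{-1/N}\,d\pi(x,y)
  = -\int_M \rho_0^{-1/N}\,d\mu_0
  = -\int_M \rho_0^{1-1/N}\,d\omega
  = S_N(\mu_0),
\]
% and similarly for \mu_1. The right-hand side of (5.1) therefore equals
\[
  (1-t)\,S_N(\mu_0) + t\,S_N(\mu_1).
\]
```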

Lott and Villani's version of the curvature-dimension condition requires a similar convexity
condition, but for all entropies induced from functions in $\mathcal{DC}_N$ ([LV1], [LV2], [Vi2,
Part III]). For $U \in \mathcal{DC}_N$ and $\mu \in \mathcal{P}^2(M)$, we denote by $\mu = \rho\omega + \mu^s$ its Lebesgue
decomposition into absolutely continuous and singular parts with respect to the base measure
$\omega$, and define

$$U_\omega(\mu) := \lim_{\varepsilon \downarrow 0} \int_{\{\rho > \varepsilon\}} U(\rho) \, d\omega + U'(\infty)\,\mu^s[M], \qquad U'(\infty) := \lim_{r \to \infty} \frac{U(r)}{r}.$$

We set $\infty \cdot 0 = 0$ by convention, and remark that $U'(\infty) = \lim_{r \to \infty} U'(r)$ holds due to the
convexity of $U$.

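Theorem 5.2 (Lott and Villani's version) We have $\mathrm{Ric}_N \ge K$ for some $K \in \mathbb{R}$ and
$N \in [n,\infty]$ if and only if, given any pair of measures $\mu_0, \mu_1 \in \mathcal{P}^2(M)$ decomposed as
$\mu_i = \rho_i\omega + \mu_i^s$ ($i = 0,1$), there is a minimal geodesic $(\mu_t)_{t \in [0,1]} \subset \mathcal{P}^2(M)$ between them
such that

$$\begin{aligned} U_\omega(\mu_t) \le{} & (1-t) \int_{M \times M} \beta^{1-t}_{K,N}\big( d_g(x,y) \big)\, U\!\left( \frac{\rho_0(x)}{\beta^{1-t}_{K,N}(d_g(x,y))} \right) d\pi_x(y)\,d\omega(x) \\ & + t \int_{M \times M} \beta^{t}_{K,N}\big( d_g(x,y) \big)\, U\!\left( \frac{\rho_1(y)}{\beta^{t}_{K,N}(d_g(x,y))} \right) d\pi_y(x)\,d\omega(y) \\ & + U'(\infty)\big\{ (1-t)\mu_0^s[M] + t\mu_1^s[M] \big\} \end{aligned} \tag{5.2}$$

holds for all $U \in \mathcal{DC}_N$ and $t \in (0,1)$, where $\pi_x$ ($\in \mathcal{P}(M)$, $\mu_0$-a.e. $x$) and $\pi_y$ ($\in \mathcal{P}(M)$,
$\mu_1$-a.e. $y$) denote the disintegrations of $\pi$ by $\mu_0$ and $\mu_1$, i.e.,

$$d\pi(x,y) = d\pi_x(y)\,d\mu_0(x) = d\pi_y(x)\,d\mu_1(y).$$

Recall that $\mathcal{DC}_{N'} \subset \mathcal{DC}_N$ for $n \le N \le N'$ (Lemma 3.3). This agrees with the
monotonicity $\mathrm{Ric}_N \le \mathrm{Ric}_{N'}$ for $n \le N \le N'$. In the case where both $\mu_0$ and $\mu_1$ are
absolutely continuous with respect to $\omega$, we find

$$d\pi(x,y) = \rho_0(x)\,d\pi_x(y)\,d\omega(x) = \rho_1(y)\,d\pi_y(x)\,d\omega(y)$$

and (5.2) is rewritten in the more symmetric form

$$\begin{aligned} U_\omega(\mu_t) \le{} & (1-t) \int_{M \times M} \frac{\beta^{1-t}_{K,N}(d_g(x,y))}{\rho_0(x)}\, U\!\left( \frac{\rho_0(x)}{\beta^{1-t}_{K,N}(d_g(x,y))} \right) d\pi(x,y) \\ & + t \int_{M \times M} \frac{\beta^{t}_{K,N}(d_g(x,y))}{\rho_1(y)}\, U\!\left( \frac{\rho_1(y)}{\beta^{t}_{K,N}(d_g(x,y))} \right) d\pi(x,y). \end{aligned}$$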

Besides Riemannian manifolds, these two versions of the curvature-dimension condition
are equivalent for metric measure spaces where geodesics do not branch, such as
Finsler manifolds and Alexandrov spaces. In other words, Sturm's version implies Lott
and Villani's one. Roughly speaking, this implication can be seen by localizing Sturm's
(5.1) thanks to the non-branching property, and then integrating these local inequalities
for each $U \in \mathcal{DC}_N$ yields (5.2). The same infinitesimal estimate (Claim 5.8) will appear
in our discussion. Theorem 5.2 is extended to general Finsler manifolds by introducing
the appropriate notion of the weighted Ricci curvature (see Section 10 and [Oh2]).

General (not necessarily differentiable) metric measure spaces satisfying the condition
in Theorem 5.1 or 5.2 are known to behave like Riemannian manifolds of $\mathrm{Ric} \ge K$ and
$\dim \le N$ in geometric and analytic respects ([St2], [St3], [LV1], [LV2], [Vi2, Part III]).
We shall generalize this technique to $\varphi$-relative entropies in the following sections.



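5.2 $\varphi$-relative entropy $H_\varphi$

Let $(M,\omega,\varphi,\Psi)$ be an admissible space in the sense of Definition 4.3. We modify $u_\varphi$ as,
for $r \ge 0$,

$$h_\varphi(r) := \begin{cases} u_\varphi(r) & \text{if } L_\varphi = \infty, \\ u_\varphi(r) - rL_\varphi & \text{if } L_\varphi < \infty. \end{cases} \tag{5.3}$$

We also define

$$h'_\varphi(\infty) := \begin{cases} \infty & \text{if } L_\varphi = \infty, \\ 0 & \text{if } L_\varphi < \infty. \end{cases}$$

Note that $u_\varphi \in \mathcal{DC}_{N_\varphi}$ (by Theorem 3.5 thanks to the admissibility) immediately implies
$h_\varphi \in \mathcal{DC}_{N_\varphi}$. Moreover, if $L_\varphi < \infty$, then $h_\varphi$ is non-increasing and hence nonpositive. We
set

$$L^\varphi(M,\omega) := \{ \rho : M \to \mathbb{R} \mid \text{measurable},\ h_\varphi(\rho) \in L^1(M,\omega) \},$$
$$\mathcal{P}^\Psi(M) := \{ \mu \in \mathcal{P}(M) \mid \Psi \in L^1(M,\mu) \}$$

(we will use these notations only in Remark 5.4). Now the Bregman divergence (2.2) leads
us to the following generalization of the relative entropy.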

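Definition 5.3 ($\varphi$-relative entropy) Given $\mu \in \mathcal{P}(M)$, letting $\mu = \rho\omega + \mu^s$ be its
Lebesgue decomposition, we define the $\varphi$-relative entropy of $\mu$ by

$$\begin{aligned} H_\varphi(\mu) &:= \int_M \{ h_\varphi(\rho) - h'_\varphi(\sigma)\rho \} \, d\omega - \int_M h'_\varphi(\sigma) \, d\mu^s + h'_\varphi(\infty)\,\mu^s[M] \\ &\phantom{:}= \int_M h_\varphi(\rho) \, d\omega - \int_M h'_\varphi(\sigma) \, d\mu + h'_\varphi(\infty)\,\mu^s[M] \end{aligned} \tag{5.4}$$

if $h_\varphi(\rho) \in L^1(M,\omega)$ and $h'_\varphi(\sigma) \in L^1(M,\mu)$, otherwise we set $H_\varphi(\mu) := \infty$.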



Let us summarize several remarks on Definition 5.3.
Remark 5.4 (1) In the second term of (5.4), to be precise, we set

I - if < oo 
This causes no problem because $M = M^\varphi_\Psi$ if $l_\varphi = -\infty$. The additional condition $\mu[M^\varphi_\Psi] =$
1 (in other words, n G V{M^)) will be imposed only when we compare the behavior of \E' 
with that of (as in Theorems 15. 7^ W7\ and so forth). 

(2) We remark that the condition (A-H]) in the admissibility guarantees that a G 
L'^{M,u) as well as G L^{M^ , u). Thus we have H^lu) G M (by extending the defini- 
tion (15. 4p verbatim). 

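(3) The validity of the definition of $h'_\varphi(\infty)$ (for the lower semi-continuity of $H_\varphi$, see
Lemma 5.6) would be understood by the following observation: For small $\varepsilon > 0$, put
$\mu_\varepsilon = \rho_\varepsilon\omega := \omega[B(x,\varepsilon)]^{-1}\chi_{B(x,\varepsilon)}\,\omega$, where $\chi_{B(x,\varepsilon)}$ stands for the characteristic function of
$B(x,\varepsilon)$. Then we have

$$\int_{B(x,\varepsilon)} h_\varphi(\rho_\varepsilon) \, d\omega = \omega[B(x,\varepsilon)] \cdot h_\varphi\!\left( \frac{1}{\omega[B(x,\varepsilon)]} \right) \to h'_\varphi(\infty)$$

as $\varepsilon$ tends to zero.

(4) Finally, as for the domain of $H_\varphi$, it is more consistent to set $H_\varphi(\mu) = \infty$ only
if $h_\varphi(\rho) - \rho h'_\varphi(\sigma) \notin L^1(M,\omega)$. However, as we will sometimes treat the internal energy
$\int_M h_\varphi(\rho) \, d\omega$ and the potential energy $\int_M h'_\varphi(\sigma) \, d\mu$ separately, we consider the smaller
domain in Definition 5.3. This may cause a problem when considering the lower semi-continuity
of $H_\varphi$, whereas we need it only for compact $M$ (see Lemma 5.6 below) where
$h'_\varphi(\sigma) \in L^1(M,\mu)$ is always true (so that $h_\varphi(\rho) \in L^1(M,\omega)$ if and only if $h_\varphi(\rho) - \rho h'_\varphi(\sigma) \in
L^1(M,\omega)$).

Let us add a comment on the relation between $\rho \in L^\varphi(M,\omega)$ and $\mu \in \mathcal{P}^\Psi(M)$. Assume
$L_\varphi < \infty$ and $\mu = \rho\omega + \mu^s \in \mathcal{P}^\Psi(M)$. The nonpositivity and the convexity of $h_\varphi$ yield

$$\begin{aligned} \int_M |h_\varphi(\rho)| \, d\omega = -\int_M h_\varphi(\rho) \, d\omega &\le -\int_M \{ h_\varphi(\sigma) + h'_\varphi(\sigma)(\rho - \sigma) \} \, d\omega \\ &= -\int_M \{ h_\varphi(\sigma) - h'_\varphi(\sigma)\sigma \} \, d\omega - \int_M \rho\, h'_\varphi(\sigma) \, d\omega < \infty. \end{aligned}$$

Hence $\rho \in L^\varphi(M,\omega)$ automatically holds. One can also see the converse implication
($\rho \in L^\varphi(M,\omega) \Rightarrow \mu \in \mathcal{P}^\Psi(M)$) for the special case $\varphi_m(s) = s^{2-m}$ with $m > 1$, where
$L_\varphi = \infty$ (see [OT, Remark 3.2(2)]).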

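It is easily observed that $\nu$ is a unique ground state of $H_\varphi$ (provided that $\nu[M] = 1$).

Lemma 5.5 Suppose $\nu[M] = 1$. For any $\mu = \rho\omega + \mu^s \in \mathcal{P}(M)$, we have $H_\varphi(\mu) \ge H_\varphi(\nu)$
and equality holds if and only if $\mu = \nu$.

Proof. We assume $H_\varphi(\mu) < \infty$ without loss of generality. Observe that

$$H_\varphi(\mu) - H_\varphi(\nu) = \int_M \{ h_\varphi(\rho) - h_\varphi(\sigma) - h'_\varphi(\sigma)(\rho - \sigma) \} \, d\omega - \int_M h'_\varphi(\sigma) \, d\mu^s + h'_\varphi(\infty)\,\mu^s[M]. \tag{5.5}$$

On the one hand, if $\mu^s[M] > 0$, then the singular part

$$-\int_M h'_\varphi(\sigma) \, d\mu^s + h'_\varphi(\infty)\,\mu^s[M]$$

in (5.5) is positive if $L_\varphi < \infty$ (recall that $h'_\varphi(\sigma) = l_\varphi - L_\varphi < 0$ on $M \setminus M^\varphi_\Psi$) or infinity
if $L_\varphi = \infty$. On the other hand, the strict convexity of $h_\varphi$ implies that the absolutely
continuous part

$$\int_M \{ h_\varphi(\rho) - h_\varphi(\sigma) - h'_\varphi(\sigma)(\rho - \sigma) \} \, d\omega$$

in (5.5) is nonnegative and equality holds if and only if $\rho = \sigma$ $\omega$-a.e. Thus $H_\varphi(\mu) \ge H_\varphi(\nu)$
and equality holds if and only if $\mu^s[M] = 0$ and $\rho = \sigma$ holds $\omega$-a.e. □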

Observe from (5.5) that it holds $D_\varphi(\mu|\nu) = H_\varphi(\mu) - H_\varphi(\nu)$ for any absolutely continuous
measure $\mu$ with respect to $\omega$. Thus the Bregman divergence $D_\varphi(\mu|\nu)$ measures
the difference between the entropies at $\mu$ and the ground state $\nu$. In [OT], we
have studied the specific function $\varphi_m(s) = s^{2-m}$ and the associated $m$-relative entropy
$H_m(\mu|\nu)$ for $m \in [(n-1)/n, 1) \cup (1,\infty)$. In the present context, $H_m(\mu|\nu)$ coincides with
$H_{\varphi_m}(\mu) - H_{\varphi_m}(\nu)$.
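To see how the classical relative entropy fits into this picture, consider the case $\varphi(s) = s$. The computation below is a sketch under the assumptions $\ln_\varphi(t) = \int_1^t ds/\varphi(s)$, $u_\varphi(r) = \int_0^r \ln_\varphi(t)\,dt$, $\sigma = \exp_\varphi(-\Psi)$, $\nu[M] = 1$ and $\mu = \rho\omega \in \mathcal{P}(M)$ absolutely continuous; these are our reading of Section 2 and are not restated in this section.

```latex
% Case \varphi(s) = s: \ln_\varphi = \log, l_\varphi = -\infty, L_\varphi = \infty, so
\[
  h_\varphi(r) = u_\varphi(r) = r\log r - r, \qquad
  h'_\varphi(\sigma) = \log\sigma = -\Psi .
\]
% Using \int_M \rho\,d\omega = \int_M \sigma\,d\omega = 1 in (5.4):
\[
  H_\varphi(\mu) = \int_M (\rho\log\rho - \rho)\,d\omega - \int_M \rho\log\sigma\,d\omega
  = \int_M \rho\log\frac{\rho}{\sigma}\,d\omega - 1,
  \qquad
  H_\varphi(\nu) = -\int_M \sigma\,d\omega = -1 .
\]
% Hence the difference is the usual relative entropy of \mu with respect to \nu = \sigma\omega:
\[
  H_\varphi(\mu) - H_\varphi(\nu) = \int_M \rho\log\frac{\rho}{\sigma}\,d\omega .
\]
```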

The following lemma will be used in Section 8 (Claim 8.8) to construct a discrete
approximation of the gradient flow of $H_\varphi$, where $M$ is assumed to be compact.

Lemma 5.6 Let $M$ be compact. Then the $\varphi$-relative entropy $H_\varphi$ is lower semi-continuous
with respect to the weak topology, that is to say, if a sequence $\{\mu_i\}_{i \in \mathbb{N}} \subset \mathcal{P}(M)$ weakly
converges to $\mu \in \mathcal{P}(M)$, then we have

$$H_\varphi(\mu) \le \liminf_{i \to \infty} H_\varphi(\mu_i).$$

Proof. We divide $H_\varphi(\mu)$ into two parts as

$$H_\varphi^{(1)}(\mu) := \int_M h_\varphi(\rho) \, d\omega + h'_\varphi(\infty)\,\mu^s[M], \qquad H_\varphi^{(2)}(\mu) := -\int_M h'_\varphi(\sigma) \, d\mu,$$

where $\mu = \rho\omega + \mu^s$. Note that $\|h'_\varphi(\sigma)\|_\infty < \infty$ thanks to the compactness of $M$ (recall
Remark 5.4(1)). Then $H_\varphi^{(2)}(\mu)$ is clearly continuous in $\mu$ and the lower semi-continuity of
$H_\varphi^{(1)}(\mu)$ follows from [LV2, Theorem B.33]. □



5.3 Displacement convexity of $H_\varphi$

In our previous work [OT], we showed that the displacement $K$-convexity of the $m$-relative
entropy $H_m(\mu|\nu) = H_{\varphi_m}(\mu) - H_{\varphi_m}(\nu)$ with respect to $\mu \in \mathcal{P}^2(M)$ is equivalent to the
combination of $\mathrm{Ric}_N \ge 0$ and $\mathrm{Hess}\,\Psi \ge K$, where $N = 1/(1-m)$ ([OT, Theorem 4.1]).
This characterization can be regarded as corresponding to Sturm's version of the curvature-dimension
condition (5.1) ($\mathrm{Hess}\,H_m(\cdot|\nu) \ge K$ and (5.1) actually coincide if $\Psi$ is constant
and $K = 0$). In the remainder of the section, we shall consider the convexity of appropriate
families of the $\varphi$-relative entropies corresponding to Lott and Villani's version of the
curvature-dimension condition (5.2). Recall (2.3) for the definition of $\theta_\varphi$.

Theorem 5.7 (Displacement convexity of families of $H_\varphi$) Given $K \in \mathbb{R}$, $N \in \mathbb{R} \setminus
(-1,n)$ and an admissible space $(M,\omega,\varphi_m,\Psi)$, the following three conditions are mutually
equivalent, where $m = (N-1)/N$:

(A) We have $\mathrm{Ric}_N \ge 0$ and $\mathrm{Hess}\,\Psi \ge K$ on $M^m_\Psi$.

(B) For any $\mu_0, \mu_1 \in \mathcal{P}^2(M)$ with $\mu_0[M^m_\Psi] = \mu_1[M^m_\Psi] = 1$ such that any pair of points
$x_i \in \mathrm{supp}\,\mu_i \cap M^m_\Psi$ ($i = 0,1$) are joined by some minimal geodesic contained in $M^m_\Psi$,
there exists a minimal geodesic $(\mu_t)_{t \in [0,1]} \subset \mathcal{P}^2(\overline{M^m_\Psi})$ along which

$$H_{\varphi_m}(\mu_t) \le (1-t)H_{\varphi_m}(\mu_0) + tH_{\varphi_m}(\mu_1) - \frac{K}{2}(1-t)t\,W_2(\mu_0,\mu_1)^2$$

holds for all $t \in [0,1]$.

(C) Take any $\varphi$ with $\theta_\varphi \le 2 - m$ such that $(M,\omega,\varphi,\Psi)$ is admissible. Then, for any
$\mu_0, \mu_1 \in \mathcal{P}^2(M)$ with $\mu_0[M^\varphi_\Psi] = \mu_1[M^\varphi_\Psi] = 1$ such that any pair of points $x_i \in
\mathrm{supp}\,\mu_i \cap M^\varphi_\Psi$ ($i = 0,1$) are joined by some minimal geodesic contained in $M^\varphi_\Psi$, there
exists a minimal geodesic $(\mu_t)_{t \in [0,1]} \subset \mathcal{P}^2(\overline{M^\varphi_\Psi})$ along which

$$H_\varphi(\mu_t) \le (1-t)H_\varphi(\mu_0) + tH_\varphi(\mu_1) - \frac{K}{2}(1-t)t\,W_2(\mu_0,\mu_1)^2 \tag{5.6}$$

holds for all $t \in [0,1]$.

Proof. The equivalence between (A) and (B) has been established in [OT, Theorem 4.1].
As (C) $\Rightarrow$ (B) is trivial (recall $\theta_{\varphi_m} = 2 - m$, see Subsection 2.4), it is enough to show (A)
$\Rightarrow$ (C).

We can assume that both $H_\varphi(\mu_0)$ and $H_\varphi(\mu_1)$ are finite, otherwise the assertion (5.6)
is obvious. We first consider the case where both $\mu_0$ and $\mu_1$ are absolutely continuous
with respect to $\omega$. By Theorem 2.6, there exists an almost everywhere twice differentiable
function $\phi : \Omega \to \mathbb{R}$ with $\mu_0[\Omega] = 1$ such that the map $T_t(x) := \exp_x(t\nabla\phi(x))$ ($t \in [0,1]$)
gives the unique minimal geodesic $\mu_t := (T_t)_\sharp\mu_0$ from $\mu_0$ to $\mu_1$. Given $\mu_0$-a.e. $x$, $T_1(x)$
is not a cut point of $x$ due to [CMS1, Proposition 4.1], so that the geodesic $(T_t(x))_{t \in [0,1]}$
is unique and contained in $M^\varphi_\Psi$. Put $\mu_t = \rho_t\omega$ and $J^\omega_t(x) := e^{f(x) - f(T_t(x))}\det(DT_t(x))$.
By the change of variables formula with the Jacobian equation $(\rho_t \circ T_t)J^\omega_t = \rho_0$ $\mu_0$-a.e.
(Theorem 2.7), we deduce that

$$H_\varphi^{(1)}(\mu_t) := \int_M h_\varphi(\rho_t) \, d\omega = \int_M h_\varphi\big( \rho_t(T_t) \big) J^\omega_t \, d\omega = \int_M h_\varphi\!\left( \frac{\rho_0}{J^\omega_t} \right) \frac{J^\omega_t}{\rho_0} \, d\mu_0 = \int_M \psi\!\left( \Big( \frac{J^\omega_t}{\rho_0} \Big)^{1/N} \right) d\mu_0,$$

where we set $\psi(r) := r^N h_\varphi(r^{-N})$. As Theorem 3.5 together with the monotonicity of
$\mathcal{DC}_N$ in $N$ (Lemma 3.3) ensures $h_\varphi \in \mathcal{DC}_N$, the function $\psi(r)$ is non-increasing (resp.
non-decreasing) and convex in $r$ if $N > 1$ (resp. $N < 0$) due to Lemma 3.2.

Then the essential ingredient is the concavity of $NJ^\omega_t(x)^{1/N}$ as in the next claim. We
give a sketch of the proof for completeness (see [St3] or [Oh2] for a detailed proof, where
a more delicate estimate under $\mathrm{Ric}_N \ge K$ for general $K \in \mathbb{R}$ is discussed).

Claim 5.8 Under $\mathrm{Ric}_N \ge 0$, $NJ^\omega_t(x)^{1/N}$ is concave in $t$ for $\mu_0$-a.e. $x$.
Proof. Take an orthonormal basis $\{e_i\}_{i=1}^n$ of $T_xM$ and extend each to the vector field
$E_i(t) := D(T_t)_x(e_i)$ for $t \in [0,1]$. Note that every $E_i$ is a Jacobi field along the geodesic
$\gamma(t) := T_t(x)$, since $T_t$ is a transport along geodesics. Let us consider the $n \times n$ matrix-valued
functions $A(t) = (a_{ij}(t))$ and $B(t) = (b_{ij}(t))$ given by

$$a_{ij}(t) := \langle E_i(t), E_j(t) \rangle, \qquad D_{\dot\gamma}E_i(t) = \sum_{j=1}^n b_{ij}(t)E_j(t),$$

where $D_{\dot\gamma}$ is the covariant derivative along $\gamma$. Observe that $J^\omega_t(x) = e^{f(x) - f(\gamma(t))}\sqrt{\det A(t)}$.
We see by calculations $A' = 2BA$ and $A'' = -2\,\mathrm{Ric}_\gamma + 2B^2A$, where we set $\mathrm{Ric}_\gamma :=
(\langle R(E_i,\dot\gamma)\dot\gamma, E_j \rangle)_{i,j=1}^n$ and $R$ stands for the Riemannian curvature tensor of $(M,g)$.
Combining these with the symmetry of $B$, we have

$$\big[ (\det A)^{1/2n} \big]'' = \frac{1}{n}(\det A)^{1/2n} \Big\{ -\mathrm{tr}(\mathrm{Ric}_\gamma A^{-1}) - \mathrm{tr}(B^2) + \frac{(\mathrm{tr}\,B)^2}{n} \Big\} \le -\frac{\mathrm{Ric}(\dot\gamma)}{n}(\det A)^{1/2n}.$$

Put

$$v(t) := J^\omega_t(x)^{1/N}, \qquad v_1(t) := e^{(f(x) - f(\gamma(t)))/(N-n)}, \qquad v_2(t) := \{\det A(t)\}^{1/2n}.$$

As $v = v_1^{(N-n)/N} v_2^{n/N}$, we obtain

$$N v^{-1} v'' \le -(f \circ \gamma)'' + \frac{((f \circ \gamma)')^2}{N-n} - \mathrm{Ric}(\dot\gamma) = -\mathrm{Ric}_N(\dot\gamma).$$

Note that the range of $N \in (-\infty,0) \cup [n,\infty)$ is essential here for making $(N-n)/N$
nonnegative. Thus the assumption $\mathrm{Ric}_N \ge 0$ implies $Nv'' \le 0$, so that $Nv = NJ^\omega_t(x)^{1/N}$
is concave in $t$. ◊

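Therefore we have, as $J^\omega_0 \equiv 1$,

$$\begin{aligned} \psi\!\left( \Big( \frac{J^\omega_t}{\rho_0} \Big)^{1/N} \right) &\le \psi\!\left( (1-t)\Big( \frac{1}{\rho_0} \Big)^{1/N} + t\Big( \frac{J^\omega_1}{\rho_0} \Big)^{1/N} \right) \\ &\le (1-t)\,\psi\!\left( \Big( \frac{1}{\rho_0} \Big)^{1/N} \right) + t\,\psi\!\left( \Big( \frac{J^\omega_1}{\rho_0} \Big)^{1/N} \right) \end{aligned} \tag{5.7}$$

$\mu_0$-a.e. This implies

$$H_\varphi^{(1)}(\mu_t) \le (1-t)H_\varphi^{(1)}(\mu_0) + tH_\varphi^{(1)}(\mu_1). \tag{5.8}$$

On the other hand, it follows from $\mathrm{Hess}\,\Psi \ge K$ that

$$\Psi(T_t(x)) \le (1-t)\Psi(x) + t\Psi(T_1(x)) - \frac{K}{2}(1-t)t\,d_g(x,T_1(x))^2$$

and hence

$$H_\varphi^{(2)}(\mu_t) := -\int_M h'_\varphi(\sigma) \, d\mu_t \le (1-t)H_\varphi^{(2)}(\mu_0) + tH_\varphi^{(2)}(\mu_1) - \frac{K}{2}(1-t)t\,W_2(\mu_0,\mu_1)^2. \tag{5.9}$$

Combining (5.8) with (5.9), we obtain the desired inequality (5.6).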

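Let us next consider the case where $\mu_0$ or $\mu_1$ has nontrivial singular part. We may
assume $L_\varphi < \infty$, otherwise (5.6) trivially holds by the definition of $h'_\varphi(\infty)$. Decompose
$\mu_0$ and $\mu_1$ as $\mu_0 = \rho_0\omega + \mu_0^s$ and $\mu_1 = \rho_1\omega + \mu_1^s$, and take an optimal coupling $\pi$ of $\mu_0$ and
$\mu_1$. Let $p_1, p_2 : M \times M \to M$ denote the projections to the first and second components.
Now, $\pi$ is decomposed into four parts

$$\pi = \pi_{aa} + \pi_{as} + \pi_{sa} + \pi_{ss}$$

such that $(p_1)_\sharp(\pi_{aa})$, $(p_1)_\sharp(\pi_{as})$, $(p_2)_\sharp(\pi_{aa})$ and $(p_2)_\sharp(\pi_{sa})$ are absolutely continuous and
that $(p_1)_\sharp(\pi_{sa})$, $(p_1)_\sharp(\pi_{ss})$, $(p_2)_\sharp(\pi_{as})$ and $(p_2)_\sharp(\pi_{ss})$ are singular or null measures. We
divide optimal transport between $\mu_0$ and $\mu_1$ into two parts, corresponding to $\pi - \pi_{ss}$
and $\pi_{ss}$. For $\tilde\mu_0 := (p_1)_\sharp(\pi - \pi_{ss})$ and $\tilde\mu_1 := (p_2)_\sharp(\pi - \pi_{ss})$, Theorem 2.6 guarantees the
existence of a geodesic

$$\tilde\mu_t \in (1 - \pi_{ss}[M \times M]) \cdot \mathcal{P}_{\mathrm{ac}}(M,\omega), \qquad t \in (0,1),$$

(i.e., $\tilde\mu_t[M] = 1 - \pi_{ss}[M \times M]$) such that $\tilde\mu_t[M^\varphi_\Psi] = \tilde\mu_t[M]$. Setting $\tilde\mu_t = \tilde\rho_t\omega$, we observe

$$\int_M h_\varphi(\tilde\rho_t) \, d\omega \le (1-t)\int_M h_\varphi(\rho_0) \, d\omega + t\int_M h_\varphi(\rho_1) \, d\omega,$$

$$-\int_M h'_\varphi(\sigma) \, d\tilde\mu_t \le -(1-t)\int_M h'_\varphi(\sigma) \, d\tilde\mu_0 - t\int_M h'_\varphi(\sigma) \, d\tilde\mu_1 - \frac{K}{2}(1-t)t\int_{M \times M} d_g(x,y)^2 \, d(\pi - \pi_{ss})(x,y).$$

To be precise, in the first inequality, we used $h_\varphi \le 0$ along the transports corresponding
to $\pi_{as}$ and $\pi_{sa}$. By Proposition 2.5, we find a minimal geodesic

$$\hat\mu_t = \hat\rho_t\omega + \hat\mu_t^s \in \pi_{ss}[M \times M] \cdot \mathcal{P}^2(M)$$

from $\hat\mu_0 := (p_1)_\sharp(\pi_{ss})$ to $\hat\mu_1 := (p_2)_\sharp(\pi_{ss})$ realized through a family of geodesics in $M^\varphi_\Psi$.
Then the condition $\mathrm{Hess}\,\Psi \ge K$ implies

$$-\int_M h'_\varphi(\sigma) \, d\hat\mu_t \le -(1-t)\int_M h'_\varphi(\sigma) \, d\hat\mu_0 - t\int_M h'_\varphi(\sigma) \, d\hat\mu_1 - \frac{K}{2}(1-t)t\int_{M \times M} d_g(x,y)^2 \, d\pi_{ss}(x,y).$$

We put $\mu_t := \tilde\mu_t + \hat\mu_t$ and conclude that

$$\begin{aligned} H_\varphi(\mu_t) &= \int_M h_\varphi(\tilde\rho_t + \hat\rho_t) \, d\omega - \int_M h'_\varphi(\sigma) \, d\mu_t \le \int_M h_\varphi(\tilde\rho_t) \, d\omega - \int_M h'_\varphi(\sigma) \, d(\tilde\mu_t + \hat\mu_t) \\ &\le (1-t)H_\varphi(\mu_0) + tH_\varphi(\mu_1) - \frac{K}{2}(1-t)t\,W_2(\mu_0,\mu_1)^2, \end{aligned}$$

where we used the fact that $h_\varphi$ is non-increasing (since $L_\varphi < \infty$) in the first inequality. □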
Remark 5.9 Recall that $M^\varphi_\Psi = M^m_\Psi = M$ if $l_\varphi = -\infty$ by definition (and admissibility).
Hence the condition in (B) and (C) that $\mathrm{supp}\,\mu_0$ and $\mathrm{supp}\,\mu_1$ are connected in $M^\varphi_\Psi$ is
nontrivial only if $l_\varphi > -\infty$. Even when $l_\varphi > -\infty$, Lemma 4.2(i) guarantees that $M^\varphi_\Psi$ is
totally convex if $\mathrm{Hess}\,\Psi \ge K > 0$.

In the limit case of $N = \infty$ ($m = 1$), we can follow the proof of (A) $\Rightarrow$ (C) using
$\psi(r) = e^r h_\varphi(e^{-r})$ as well as the concavity of $\log(J^\omega_t(x))$. However, the implication (B)
$\Rightarrow$ (A) is not true. This is because the two weights $f$ and $\Psi$ are synchronized as $\nu =
e^{-f-\Psi}\,\mathrm{vol}_g$ and we can control only the behavior of $f + \Psi$ (see the proof of (B) $\Rightarrow$ (A)
sketched in the next subsection).

Instead, one can see from Theorem 5.2 that $\mathrm{Ric}_\infty \ge K$ (of $(M,\omega)$) implies the $\lambda(K,U)$-convexity
of $U_\omega$ for all $U \in \mathcal{DC}_\infty$, where

$$\lambda(K,U) := \inf_{r > 0} K\,\frac{rU'(r) - U(r)}{r} = \begin{cases} K \lim_{r \to 0} \{rU'(r) - U(r)\}/r & \text{if } K > 0, \\ 0 & \text{if } K = 0, \\ K \lim_{r \to \infty} \{rU'(r) - U(r)\}/r & \text{if } K < 0 \end{cases}$$

i pm Theorem 7.3], [Vi2l Theorem 30.5]). 



6 Functional inequalities

If $\mathrm{Ric}_{N_\varphi} \ge 0$ and $\mathrm{Hess}\,\Psi \ge K$ for some $K > 0$, then we can obtain variants of the
Talagrand inequality, the HWI inequality, the logarithmic Sobolev inequality and the
global Poincaré inequality. These are derived from fundamental properties of convex
functions along the lines of [OV] and [LV2, Section 6] (see also [OT, Section 5] where
we studied the special case of the $m$-relative entropies). We will impose only the strictly
weaker condition $\mathrm{Hess}\,H_\varphi \ge K > 0$ (for single $\varphi$) in the Talagrand inequality for the
use in the next section.

For $\mu = \rho\omega \in \mathcal{P}_{\mathrm{ac}}(M,\omega)$ with $\mu[M^\varphi_\Psi] = 1$, we define the $\varphi$-relative Fisher information
with respect to $\nu = \sigma\omega$ by

$$I_\varphi(\mu) := \int_M \big| \nabla[h'_\varphi(\rho) - h'_\varphi(\sigma)] \big|^2 \, d\mu = \int_M \Big| \frac{\nabla\rho}{\varphi(\rho)} + \nabla\Psi \Big|^2 d\mu \tag{6.1}$$

provided that it is well-defined, otherwise we set $I_\varphi(\mu) := \infty$. This quantity describes
the directional derivatives of $H_\varphi$ as follows. (At this point the treatment in [OT] was
somewhat too rough; the argument in the present paper is correct.)

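Proposition 6.1 (Directional derivatives of $H_\varphi$) Let $(M,\omega,\varphi,\Psi)$ be an admissible
space with $\mathrm{Ric}_{N_\varphi} \ge 0$ and $\mathrm{Hess}\,\Psi \ge K$ on $M^\varphi_\Psi$ for some $K \in \mathbb{R}$, and $\mu = \rho\omega \in \mathcal{P}^2_{\mathrm{ac}}(M,\omega)$
be such that $\mu[M^\varphi_\Psi] = 1$, $H_\varphi(\mu) < \infty$, $\rho h'_\varphi(\rho) - h_\varphi(\rho) \in H^1_{\mathrm{loc}}(M)$ and that $|\nabla\rho/\varphi(\rho)| +
|\nabla\Psi| \in L^2(M,\mu)$. Take a minimal geodesic $(\mu_t)_{t \in [0,1]} \subset \mathcal{P}^2(M)$ emanating from $\mu_0 = \mu$
generated by a locally semi-convex function $\phi : M \to \mathbb{R}$ as $\mu_t = (T_t)_\sharp\mu$ with $T_t(x) =
\exp_x(t\nabla\phi(x))$. If $\theta_\varphi < 1$, then we further suppose that $\mathrm{supp}\,\mu_0$ and $\mathrm{supp}\,\mu_1$ are compact.
Then we have

$$\liminf_{t \downarrow 0} \frac{H_\varphi(\mu_t) - H_\varphi(\mu)}{t} \ge \int_M \Big\langle \frac{\nabla\rho}{\varphi(\rho)} + \nabla\Psi, \nabla\phi \Big\rangle \, d\mu. \tag{6.2}$$

Moreover, equality holds in (6.2) (with $\lim_{t \downarrow 0}$ in place of $\liminf_{t \downarrow 0}$) if $\phi \in C_c^\infty(M)$.

Proof. We first deduce equality in (6.2) for $\phi \in C_c^\infty(M)$. Put $\mu_t = \rho_t\omega$ and $J^\omega_t :=
e^{f - f(T_t)}\det(DT_t)$ as in the proof of Theorem 5.7. We follow the calculation in Theorem 5.7
and see

$$H_\varphi(\mu_t) - H_\varphi(\mu) = \int_M \Big\{ h_\varphi\Big( \frac{\rho}{J^\omega_t} \Big) J^\omega_t - h_\varphi(\rho) \Big\} \, d\omega - \int_M \{ h'_\varphi(\sigma \circ T_t) - h'_\varphi(\sigma) \} \, d\mu.$$

By the convexity (5.7) of $h_\varphi(\rho/J^\omega_t)J^\omega_t$ in $t$ and $\mathrm{Hess}\,\Psi \ge K$, we can apply the monotone
convergence theorem and obtain

$$\lim_{t \downarrow 0} \frac{H_\varphi(\mu_t) - H_\varphi(\mu)}{t} = \int_M \lim_{t \downarrow 0} \frac{h_\varphi(\rho/J^\omega_t)J^\omega_t - h_\varphi(\rho)}{t} \, d\omega + \int_M \langle \nabla\Psi, \nabla\phi \rangle \, d\mu.$$

Note that, by using the weighted Laplacian $\Delta^\omega$ introduced at the beginning of Subsection 8.3
below,

$$\lim_{t \downarrow 0} \frac{J^\omega_t - 1}{t} = \lim_{t \downarrow 0} \frac{e^{f - f(T_t)}\det(DT_t) - 1}{t} = \Delta\phi - \langle \nabla\phi, \nabla f \rangle = \Delta^\omega\phi.$$

Thus we have

$$\lim_{t \downarrow 0} \frac{h_\varphi(\rho/J^\omega_t)J^\omega_t - h_\varphi(\rho)}{t} = \{ h_\varphi(\rho) - \rho h'_\varphi(\rho) \}\,\Delta^\omega\phi.$$

Therefore we conclude, by the integration by parts for $\Delta^\omega$ (since $\phi \in C_c^\infty(M)$),

$$\lim_{t \downarrow 0} \frac{H_\varphi(\mu_t) - H_\varphi(\mu)}{t} = \int_M \big\{ \langle \nabla[h'_\varphi(\rho)\rho - h_\varphi(\rho)], \nabla\phi \rangle + \rho\langle \nabla\Psi, \nabla\phi \rangle \big\} \, d\omega = \int_M \Big\langle \frac{\nabla\rho}{\varphi(\rho)} + \nabla\Psi, \nabla\phi \Big\rangle \, d\mu.$$

In the case of $\phi \notin C_c^\infty(M)$, we need to take care about the last step of integration by
parts. If $\theta_\varphi \ge 1$ (equivalently, $N_\varphi \in [n,\infty] \cap (2,\infty]$), then we can directly apply [Vi2,
Theorem 23.14] to see (6.2). For $\theta_\varphi < 1$, the same proof (Step 3 in [Vi2, Theorem 23.14])
still works provided that $\mathrm{supp}\,\mu_0$ and $\mathrm{supp}\,\mu_1$ are compact. □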

Remark 6.2 Let us add some more remarks on the case of $\theta_\varphi < 1$. A large part of the
proof of [Vi2, Theorem 23.14] also works in this case (even without the approximation
procedure based on [Vi2, Proposition 17.7]). Proposition 2.12(i) ensures $l_\varphi > -\infty$ so that
$u_\varphi(r) \ge l_\varphi r$ for all $r \ge 0$, and Lemma 2.9 shows

$$\frac{s}{\varphi(s)} \le \Big( \frac{s}{t} \Big)^{1-\theta_\varphi} \frac{t}{\varphi(t)} \le \frac{t}{\varphi(t)}$$

for all $0 < s < t$, which corresponds to [Vi2, (23.52)] with $A = \infty$ (hence (23.53) and
(23.54) are unnecessary). Note that $p'(s)$ in [Vi2] is $s/\varphi(s)$ in our context. The only
problem is that $s/\varphi(s)$ is never bounded for large $s$, as seen from the model case of
s/ipm{s) = s"^~^ with m > 1. The boundedness is used to guarantee p'{p) G L^(M, //), so 
that we can assume p/(p{p) G L?{M,p) instead of the compactness of supp/io U supp/xi 
in Proposition 16.11 above. 
We assume $\nu[M] = 1$ by scaling (recall Remark 4.6), and prove functional inequalities associated with $H_\varphi$.

Theorem 6.3 Let $(M,\omega,\varphi,\Psi)$ be admissible with $\nu \in \mathcal{P}^2_{ac}(M,\omega)$. We set $H_\nu := H_\varphi(\nu)$ for brevity.

(i) ($\varphi$-Talagrand inequality) Suppose that $\mathrm{Hess}\,H_\varphi \ge K$ for some $K > 0$. For any $\mu \in \mathcal{P}^2(M)$, we have

$$W_2(\mu,\nu) \le \sqrt{\frac{2}{K}\{H_\varphi(\mu) - H_\nu\}}. \tag{6.3}$$

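(ii) ($\varphi$-HWI, $\varphi$-logarithmic Sobolev inequalities) Assume $\mathrm{Ric}_{N_\varphi} \ge 0$ and $\mathrm{Hess}\,\Psi \ge K$ on $M_\varphi^\Psi$ for some $K > 0$. Given $\mu = \rho\omega \in \mathcal{P}^2_{ac}(M,\omega)$ with $\mu[M_\varphi^\Psi] = 1$ such that $H_\varphi(\mu) < \infty$ and that $\rho$ is locally Lipschitz, we have

$$H_\varphi(\mu) - H_\nu \le \sqrt{I_\varphi(\mu)}\, W_2(\mu,\nu) - \frac{K}{2} W_2(\mu,\nu)^2, \tag{6.4}$$

$$H_\varphi(\mu) - H_\nu \le \frac{1}{2K} I_\varphi(\mu). \tag{6.5}$$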



(iii) (<y9-global Poincare inequality) Let {M,g) be compact and (p be , and assume 
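$\mathrm{Ric}_{N_\varphi} \ge 0$ and $\mathrm{Hess}\,\Psi \ge K$ on $M_\varphi^\Psi$ for some $K > 0$. Then for any Lipschitz function $w : M_\varphi^\Psi \to \mathbb{R}$ such that $\int_{M_\varphi^\Psi} w\, d\nu = 0$, we have

$$\int_{M_\varphi^\Psi} w^2 \frac{\sigma}{\varphi(\sigma)}\, d\nu \le \frac{1}{K} \int_{M_\varphi^\Psi} \left| \nabla\!\left( \frac{w\sigma}{\varphi(\sigma)} \right) \right|^2 d\nu. \tag{6.6}$$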



Proof. We first remark that $M_\varphi^\Psi$ is totally convex if $\mathrm{Hess}\,\Psi \ge K > 0$ (see Lemma 4.2 and Remark 5.9 as well). Thus $M_\varphi^\Psi$ is totally convex in (ii) and (iii).

(i) There is nothing to prove if $H_\varphi(\mu) = \infty$, so that we assume $H_\varphi(\mu) < \infty$. By the hypothesis $\mathrm{Hess}\,H_\varphi \ge K$, there is a minimal geodesic $(\mu_t)_{t \in [0,1]} \subset \mathcal{P}^2(M)$ from $\mu_0 = \mu$ to $\mu_1 = \nu$ such that

$$0 \le H_\varphi(\mu_t) - H_\nu \le (1-t)H_\varphi(\mu) - (1-t)H_\nu - \frac{K}{2}(1-t)t\, W_2(\mu,\nu)^2 \tag{6.7}$$

for all $t \in [0,1]$. Dividing both sides by $1-t$ and letting $t$ go to $1$, we obtain the desired inequality (6.3).
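
The limiting step can be spelled out as follows. This is only a sketch of the arithmetic, using (6.7) and the abbreviation $W := W_2(\mu,\nu)$ (a shorthand introduced here for readability).

```latex
% Sketch: from the K-convexity estimate (6.7) to the Talagrand-type bound (6.3).
% W := W_2(\mu,\nu) and H_\nu := H_\varphi(\nu); t ranges over [0,1).
\begin{align*}
0 &\le (1-t)\,\{H_\varphi(\mu) - H_\nu\} - \frac{K}{2}\,(1-t)\,t\,W^2
   && \text{(outer terms of (6.7))} \\
\frac{K}{2}\,t\,W^2 &\le H_\varphi(\mu) - H_\nu
   && \text{(divide by } 1-t > 0 \text{)} \\
W^2 &\le \frac{2}{K}\,\{H_\varphi(\mu) - H_\nu\}
   && \text{(let } t \uparrow 1 \text{)}
\end{align*}
```

Taking square roots gives the inequality in the form with $W_2(\mu,\nu)$ on the left hand side.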

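(ii) As the case of $I_\varphi(\mu) = \infty$ is trivial, we assume $I_\varphi(\mu) < \infty$. For the minimal geodesic $(\mu_t)_{t \in [0,1]}$ from $\mu_0 = \mu$ to $\mu_1 = \nu$, Theorem 2.6 ensures that $\mu_t \in \mathcal{P}^2_{ac}(M,\omega)$ for all $t \in [0,1]$ and there is a locally semi-convex function $\phi$ such that $\mu_t = (\mathcal{T}_t)_\sharp\mu$ with $\mathcal{T}_t(x) = \exp_x(t\nabla\phi(x))$. Due to (6.7), we have

$$\liminf_{t \downarrow 0} \frac{H_\varphi(\mu_t) - H_\varphi(\mu)}{t} \le -\{H_\varphi(\mu) - H_\nu\} - \frac{K}{2} W_2(\mu,\nu)^2. \tag{6.8}$$

Moreover, Proposition 6.1 shows that

$$\liminf_{t \downarrow 0} \frac{H_\varphi(\mu_t) - H_\varphi(\mu)}{t} \ge \int_M \langle \nabla[\ln_\varphi(\rho) - \ln_\varphi(\sigma)], \nabla\phi \rangle\, d\mu.$$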
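We remark that $M_\varphi^\Psi$ is bounded if $\theta_\varphi < 1$ (by Proposition 2.12(i) and Lemma 4.2(i)), so that Proposition 6.1 is certainly available. We obtain from the Cauchy-Schwarz inequality that

$$\int_M \langle \nabla[\ln_\varphi(\rho) - \ln_\varphi(\sigma)], \nabla\phi \rangle\, d\mu \ge -\left( \int_M |\nabla[\ln_\varphi(\rho) - \ln_\varphi(\sigma)]|^2\, d\mu \right)^{1/2} \left( \int_M |\nabla\phi|^2\, d\mu \right)^{1/2} = -\sqrt{I_\varphi(\mu)}\, W_2(\mu,\nu),$$

where the last equality follows from

$$|\nabla\phi(x)| = d_g\big(x, \exp_x(\nabla\phi)\big) = d_g\big(\mathcal{T}_0(x), \mathcal{T}_1(x)\big) \quad \mu\text{-a.e. } x.$$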



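Combining this with (6.8), we obtain (6.4). By completing the square, we deduce (6.5) from (6.4) as

$$\sqrt{I_\varphi(\mu)}\, W_2(\mu,\nu) - \frac{K}{2} W_2(\mu,\nu)^2 = -\frac{K}{2}\left( W_2(\mu,\nu) - \frac{\sqrt{I_\varphi(\mu)}}{K} \right)^2 + \frac{I_\varphi(\mu)}{2K} \le \frac{I_\varphi(\mu)}{2K}.$$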



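(iii) For small $\varepsilon > 0$, we put $\mu = \rho\omega := (1 + \varepsilon w)\sigma\omega$. We remark that $H_\varphi(\mu)$ and $I_\varphi(\mu)$ are finite as $M$ is compact. It follows from (6.5) that

$$\int_M \{ u_\varphi(\rho) - u_\varphi(\sigma) - u'_\varphi(\sigma)(\rho - \sigma) \}\, d\omega \le \frac{1}{2K} \int_M |\nabla[\ln_\varphi(\rho) - \ln_\varphi(\sigma)]|^2\, d\mu.$$

On the one hand, we have by expansion

$$u_\varphi(\rho) - u_\varphi(\sigma) - u'_\varphi(\sigma)(\rho - \sigma) = \frac{(\rho - \sigma)^2}{2\varphi(\sigma)} + O\big((\rho - \sigma)^3\big) = \frac{\varepsilon^2 w^2 \sigma^2}{2\varphi(\sigma)} + O(\varepsilon^3),$$

where $O(\varepsilon^3)$ is uniform on $M$ (for fixed $w$) thanks to the compactness of $M$. On the other hand, it holds

$$|\nabla[\ln_\varphi(\rho) - \ln_\varphi(\sigma)]|^2 = \big| \nabla[(\rho - \sigma)\ln'_\varphi(\sigma) + O((\rho - \sigma)^2)] \big|^2 = \varepsilon^2 \left| \nabla\!\left( \frac{w\sigma}{\varphi(\sigma)} \right) \right|^2 + O(\varepsilon^3).$$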



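Thus we have, dividing both sides by $\varepsilon^2/2$ and letting $\varepsilon$ go to zero,

$$\int_M w^2 \frac{\sigma}{\varphi(\sigma)}\, d\nu \le \frac{1}{K} \int_M \left| \nabla\!\left( \frac{w\sigma}{\varphi(\sigma)} \right) \right|^2 d\nu.$$
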



□ 



The $\varphi$-Talagrand inequality (6.3) is regarded as a comparison between the distance functions appearing in Wasserstein geometry and information geometry, since the square root of the Bregman divergence behaves like a distance function (see Subsection 2.3). Note that, in the $\varphi$-global Poincaré inequality (6.6), the usual global Poincaré inequality $\int_M w^2\, d\nu \le K^{-1}\int_M |\nabla w|^2\, d\nu$ is indeed recovered when $\varphi(s) = \varphi_1(s) = s$. Other inequalities are also clearly reduced to the usual ones for $\varphi = \varphi_1$.
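
The reduction for $\varphi = \varphi_1$ can be checked directly in the Talagrand case. The following sketch assumes $u_\varphi(r) = \int_0^r \ln_\varphi(s)\,ds$ (so that $u_{\varphi_1}(r) = r\ln r - r$) and uses the Bregman-type expression for $H_\varphi(\mu) - H_\nu$ appearing in the proof of (iii); the symbol $\mathrm{Ent}_\nu$ for the classical relative entropy with respect to $\nu$ is a shorthand used only here.

```latex
% Sketch: for \varphi_1(s) = s we have \ln_{\varphi_1} = \ln and u_{\varphi_1}(r) = r\ln r - r.
\begin{align*}
u_{\varphi_1}(\rho) - u_{\varphi_1}(\sigma) - u'_{\varphi_1}(\sigma)(\rho - \sigma)
  &= (\rho\ln\rho - \rho) - (\sigma\ln\sigma - \sigma) - (\rho - \sigma)\ln\sigma \\
  &= \rho\ln\frac{\rho}{\sigma} - \rho + \sigma .
\end{align*}
% Integrating against \omega and using \int_M \rho\,d\omega = \int_M \sigma\,d\omega = 1:
\[
  H_{\varphi_1}(\mu) - H_\nu = \int_M \rho\ln\frac{\rho}{\sigma}\,d\omega = \mathrm{Ent}_\nu(\mu),
  \qquad\text{so (6.3) reads}\qquad
  W_2(\mu,\nu)^2 \le \frac{2}{K}\,\mathrm{Ent}_\nu(\mu).
\]
```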
7 Concentration of measures

The aim of this section is to derive the concentration of measures from the $\varphi$-Talagrand inequality (6.3). Let us assume $\nu[M] = 1$ (see Remark 4.6) and define the concentration function by

$$\alpha(r) = \alpha_{(M,\nu)}(r) := \sup\{ 1 - \nu[B(A,r)] \mid A \subset M \text{ measurable},\ \nu[A] \ge 1/2 \}$$

for $r > 0$, where $B(A,r) := \{ y \in M \mid \inf_{x \in A} d_g(x,y) < r \}$. The function $\alpha$ describes how the probability measure $\nu$ is concentrating on the neighborhood of an arbitrary set $A$ of half the total measure in a quantitative way (in other words, a kind of large (or moderate) deviation principle). Equivalently, $\alpha$ measures how any 1-Lipschitz function is close to the constant function at its mean. We refer to the excellent book [Le] for an introduction to the concentration of measure phenomenon.

In the classical case of $\varphi_1(s) = s$, the Talagrand inequality (6.3) implies the normal concentration $\alpha(r) \le C\exp(-cr^2)$, or equivalently $\alpha(r)^{-1} \ge C^{-1}\exp(cr^2)$, with constants $c, C > 0$ depending only on $K$ (see [Le, Section 6.1]). For general $\varphi$, we will similarly derive from (6.3) the $m$-normal concentration involving the $m$-exponential function $e_m$ (see Subsection 2.4). Precisely, we have $\alpha(r) \le C e_m(-cr^2)$ with $m = m(\varphi) \le 2 - \theta_\varphi$ if $\theta_\varphi > 1$, and $\alpha(r)^{-1} \ge C^{-1} e_m(cr^2)$ with $m = m(\varphi) \ge 2 - \theta_\varphi$ if $\theta_\varphi < 1$ (Theorem 7.9).

7.1 General estimate 

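For each measurable set $A \subset M$ with $0 < \nu[A] < \infty$, denote the normalized restriction of $\nu$ on $A$ by

$$\nu_A := \frac{1}{\nu[A]}\, \nu|_A.$$

To analyze its entropy $H_\varphi(\nu_A)$, we introduce the function

$$U(\xi,t) := u_\varphi\!\left( \frac{\xi}{t} \right) - \frac{\xi}{t}\ln_\varphi(\xi), \qquad (\xi,t) \in (0,\infty) \times (0,1].$$

Note that $H_\varphi(\nu_A) = \int_A U(\sigma, \nu[A])\, d\omega$. To be precise, we can set $U(\sigma, \nu[A]) := 0$ on $M \setminus M_\varphi^\Psi$ thanks to the following lemma.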

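Lemma 7.1 If $\theta_\varphi < 2$, then we have $\lim_{\xi \downarrow 0} U(\xi,t) = 0$ for every $t \in (0,1]$.

Proof. Note that

$$\begin{aligned} U(\xi,t) &= \int_0^{\xi/t} \{ \ln_\varphi(s) - \ln_\varphi(\xi) \}\, ds = \int_0^{\xi/t}\!\!\int_\xi^s \frac{1}{\varphi(r)}\, dr\, ds \\ &= -\int_0^\xi\!\!\int_s^\xi \frac{1}{\varphi(r)}\, dr\, ds + \int_\xi^{\xi/t}\!\!\int_\xi^s \frac{1}{\varphi(r)}\, dr\, ds = -\int_0^\xi \frac{r}{\varphi(r)}\, dr + \int_\xi^{\xi/t} \frac{\xi/t - r}{\varphi(r)}\, dr. \end{aligned} \tag{7.1}$$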

Then (2.1) (deduced from Theorem 3.5) shows the claim. □
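Lemma 7.2 Assume $\theta_\varphi < 2$. For any measurable set $A$ with $\nu[A] \ge 1/2$, we have $H_\varphi(\nu_A) \le 0$. In particular, it holds $H_\varphi(\nu) \le 0$ if $\nu[M] = 1$.

Proof. For any $\xi > 0$, $U(\xi,t)$ is non-increasing in $t \in (0,1]$ since we have

$$\frac{\partial U}{\partial t}(\xi,t) = -\frac{\xi}{t^2}\left\{ \ln_\varphi\!\left( \frac{\xi}{t} \right) - \ln_\varphi(\xi) \right\} \le 0.$$

Similarly, $U(\xi,1/2)$ is non-increasing in $\xi$ due to

$$\frac{\partial U}{\partial \xi}(\xi,1/2) = 2\{ \ln_\varphi(2\xi) - \ln_\varphi(\xi) \} - \frac{2\xi}{\varphi(\xi)} = 2\int_\xi^{2\xi} \left\{ \frac{1}{\varphi(r)} - \frac{1}{\varphi(\xi)} \right\} dr \le 0.$$

Thus we deduce from Lemma 7.1 that, for any $\xi > 0$ and $t \ge 1/2$,

$$U(\xi,t) \le U(\xi,1/2) \le \lim_{\varepsilon \downarrow 0} U(\varepsilon,1/2) = 0,$$

which shows $H_\varphi(\nu_A) = \int_A U(\sigma,\nu[A])\, d\omega \le 0$. □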

Next we give an estimate on $H_\varphi(\nu_B)$ for $B \subset M$ with not necessarily $\nu[B] \ge 1/2$. Recall (2.4) for the definition of $\theta_\varphi$.

Lemma 7.3 Assume $\theta_\varphi < \infty$ and $\|\sigma\|_\infty < \infty$. Given any measurable set $B \subset M$ with $0 < \nu[B] < \infty$ and any $\xi_0 \ge \max\{\nu[B], \|\sigma\|_\infty\}$, we have

H^M < -iy[B]'--Hn^{i^[B])^',--'- [ a^~'-du. (7.2) 

Jb 

Proof. For $t, \xi \in (0, \xi_0]$, we deduce from (7.1) that

where we changed the variables as $r = s\xi t^{-1}$. Note that Lemma 2.9 shows that for all



< ^ — ^ < 



ip{sit-^) - ifis^ot-') - ^(s) V t 
Thus we find 

U{^,t)<-t'--Hn^mo-~'-e-'^- 

This implies 

H^i'yB)= [ Uia,u[B])dcu<-u[B]'--Hn^iu[B])^',-~'^- [ a'~'- du 
Jb Jb 

as desired. □ 
We remark that, if $\theta_\varphi < 2$, then we have at any $s \in (0,1)$

$$\frac{d}{ds}\big[ s^{\theta_\varphi - 2}\ln_\varphi(s) \big] = s^{\theta_\varphi - 3}\left\{ (\theta_\varphi - 2)\ln_\varphi(s) + \frac{s}{\varphi(s)} \right\} \ge 0.$$

Therefore the right hand side of (7.2) is non-increasing in $\nu[B]$ provided that $\nu$ is a probability measure.

Now we show a general estimate of $\alpha(r)$ under the strict convexity of $H_\varphi$.

Proposition 7.4 Assume that $(M,\omega,\varphi,\Psi)$ is admissible, $\nu \in \mathcal{P}_{ac}(M,\omega)$, $\mathrm{Hess}\,H_\varphi \ge K$ for some $K > 0$ and that $\|\sigma\|_\infty < \infty$. We set $H_\nu := H_\varphi(\nu)$ $(\le 0)$ as in Theorem 6.3. Then, for any $\xi_0 \ge \max\{1/2, \|\sigma\|_\infty\}$ and any $r > 0$ with $\alpha(r) > 0$, we have

a{rf--HYi^{a{r))^l^-'- -snv j^a^-'^du <-(^^r-y^^ - H,, (7.3) 

where $A \subset M$ runs over all measurable sets with $\nu[A] \ge 1/2$ and we set $B := M \setminus B(A,r)$.

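Proof. Since $\alpha(r) \le 1$ by definition, the left hand side of (7.3) is always nonpositive. Therefore the assertion is clear if $r^2 \le -8H_\nu/K$. Suppose $r^2 > -8H_\nu/K$, take a measurable set $A \subset M$ with $\nu[A] \ge 1/2$ and put $B := M \setminus B(A,r)$. We also assume $\nu[B] > 0$ since we have $\alpha(r) = 0$ if $\nu[B] = 0$ for all such $A$.

Observe that $W_1(\nu_A,\nu_B) \ge r$ as $d_g(x,y) \ge r$ for all $x \in A$ and $y \in B$. Note also that $W_1 \le W_2$ holds by the Cauchy-Schwarz inequality. Then the triangle inequality for $W_1$ and the $\varphi$-Talagrand inequality (6.3) yield

$$r \le W_1(\nu_A,\nu) + W_1(\nu,\nu_B) \le W_2(\nu_A,\nu) + W_2(\nu,\nu_B) \le \sqrt{\frac{2}{K}}\left\{ \sqrt{H_\varphi(\nu_A) - H_\nu} + \sqrt{H_\varphi(\nu_B) - H_\nu} \right\}.$$

Applying Lemma 7.2 gives, as $r^2 > -8H_\nu/K$ ensures $\sqrt{K/2}\, r > \sqrt{-H_\nu}$,

$$H_\varphi(\nu_B) \ge \left( \sqrt{\frac{K}{2}}\, r - \sqrt{-H_\nu} \right)^2 + H_\nu.$$

Since $A$ is arbitrary and $\nu[B] \le 1/2 \le \xi_0$, combining the above estimate with Lemma 7.3 yields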

- -H,> sup !^u[BY--HnMBmy'^ 

> a{rY^-Hn^{a{r))^Q^~^^ ■ sup / a^"^^ du. 

A Jb 



□ 






7.2 Concentration of measures 

We shall obtain the concentration of $\{(M,\omega,\varphi,\Psi_i)\}_{i \in \mathbb{N}}$, i.e., $\lim_{i \to \infty} \alpha_{(M,\nu_i)}(r) = 0$ for all $r > 0$ with $\nu_i := \exp_\varphi(-\Psi_i)\omega$, under an appropriate condition on the convexity of the entropy $H_\varphi^i$ associated with $\Psi_i$. We first prove an auxiliary lemma.

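Lemma 7.5 Assume that $(M,\omega,\varphi,\Psi)$ is admissible, $\nu \in \mathcal{P}_{ac}(M,\omega)$ and that $\|\sigma\|_\infty < \infty$. Set $H_\nu := H_\varphi(\nu)$ and take arbitrary $\xi_0 \ge \|\sigma\|_\infty$.

(i) If $\theta_\varphi < 1$, then we have

$$\int_M \sigma^{2-\theta_\varphi}\, d\omega \le \xi_0^{1-\theta_\varphi}, \qquad H_\nu \ge -\frac{\xi_0}{(2-\theta_\varphi)\varphi(\xi_0)}.$$

(ii) If $\theta_\varphi \in (1,3/2)$ and $\omega[M] < \infty$, then we have

$$\int_M \sigma^{2-\theta_\varphi}\, d\omega \le \omega[M]^{\theta_\varphi - 1}, \qquad H_\nu \ge -\frac{\xi_0^{\theta_\varphi}\, \omega[M]^{\theta_\varphi - 1}}{(2-\theta_\varphi)\varphi(\xi_0)}.$$

Proof. It follows from (7.1) and Lemma 2.9 that

$$H_\nu = \int_M U(\sigma,1)\, d\omega \ge -\int_M\!\int_0^\sigma \frac{r}{\varphi(r)}\, dr\, d\omega \ge -\int_M\!\int_0^\sigma \frac{\xi_0^{\theta_\varphi} r^{1-\theta_\varphi}}{\varphi(\xi_0)}\, dr\, d\omega = -\frac{\xi_0^{\theta_\varphi}}{\varphi(\xi_0)} \int_M \frac{\sigma^{2-\theta_\varphi}}{2-\theta_\varphi}\, d\omega.$$

(i) The assertion immediately follows from

$$\int_M \sigma^{2-\theta_\varphi}\, d\omega \le \|\sigma\|_\infty^{1-\theta_\varphi} \int_M \sigma\, d\omega \le \xi_0^{1-\theta_\varphi}.$$

(ii) The Hölder inequality yields that

$$\int_M \sigma^{2-\theta_\varphi}\, d\omega \le \left( \int_M \sigma\, d\omega \right)^{2-\theta_\varphi} \omega[M]^{\theta_\varphi - 1} = \omega[M]^{\theta_\varphi - 1},$$

which shows the claim. □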
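
The Hölder step in (ii) can be made explicit. The exponents $p$ and $q$ below are labels introduced only for this sketch; it assumes $\theta_\varphi \in (1,2)$, so that both exceed $1$, and $\int_M \sigma\,d\omega = \nu[M] = 1$.

```latex
% Sketch: Hölder's inequality with p = 1/(2-\theta_\varphi), q = 1/(\theta_\varphi-1),
% which satisfy 1/p + 1/q = (2-\theta_\varphi) + (\theta_\varphi-1) = 1.
\[
  \int_M \sigma^{2-\theta_\varphi}\cdot 1\,d\omega
  \le \Big(\int_M \big(\sigma^{2-\theta_\varphi}\big)^{p}\,d\omega\Big)^{1/p}
      \Big(\int_M 1^{q}\,d\omega\Big)^{1/q}
  = \Big(\int_M \sigma\,d\omega\Big)^{2-\theta_\varphi}\,\omega[M]^{\theta_\varphi-1}
  = \omega[M]^{\theta_\varphi-1}.
\]
```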

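Theorem 7.6 Let $\{(M,\omega,\varphi,\Psi_i)\}_{i \in \mathbb{N}}$ be a sequence of admissible spaces such that

(a) $\omega[M] < \infty$ if $\theta_\varphi > 1$,

(b) $\nu_i = \sigma_i\omega := \exp_\varphi(-\Psi_i)\omega \in \mathcal{P}_{ac}(M,\omega)$ for all $i$,

(c) $\xi_i := \max\{1, \|\sigma_i\|_\infty\} < \infty$ for all $i$,

(d) $\mathrm{Hess}\,H_\varphi^i \ge K_i$ for some $K_i > 0$, where $H_\varphi^i$ is the $\varphi$-relative entropy for $(M,\omega,\Psi_i)$,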

(e) limi^oo Ki^i"''^ = oo if 6^ < 1, and linij^oo A'^^^"'"^'^ = oo if 6^ > 1. 
Then the concentration function $\alpha_i(r) := \alpha_{(M,\nu_i)}(r)$ satisfies $\lim_{i \to \infty} \alpha_i(r) = 0$ for all $r > 0$.

Proof. Fix $r > 0$ and put $\alpha_i := \alpha_i(r)$ and $H_\nu^i := H_\varphi^i(\nu_i)$ for brevity. It follows from Proposition 7.4 that

( / [17- \ 2 ^ / r X -1 



A/ 



We have \n^{ai) > l2-e^{,0ii) by (12.9p . so that 

< -ef-^- (l^r^ - v^v^^) (/^ ' • (7-4) 

Now, for 6^ Lemma 17751 (1) and Lemma [2]9] yield 



6 ^ 



1-5. 



since > 1. Hence the right hand side of (17.41) is bounded from above by (for large i) 



as i goes to infinity due to the condition ^ . Therefore we obtain 

lim a:r'^i2-e^{ai) < hm af"'"^£2-e^(ai) = -oo, 

and hence limj_^oo = 0. 

For 6',^ > 1, we similarly deduce from Lemma [7.5( ii) that 



Hence the right hand side of (17.41) is bounded from above by 



r —7- — oo U — !■ oo . 



Thus we have limj_j,oo = 0. □ 

Remark 7.7 (1) For $\varphi_m$ with $m < 1$, we have $\theta_{\varphi_m} = \delta_{\varphi_m}$ and hence the condition (e) is reduced to $\lim_{i \to \infty} K_i = \infty$, whereas the condition $\|\sigma_i\|_\infty < \infty$ was implicitly used in our discussion. See [OT, Section 6] for a more precise estimate associated with $\varphi_m$ without assuming $\|\sigma_i\|_\infty < \infty$.

(2) We stress that only $\mathrm{Hess}\,H_\varphi^i \ge K_i$ for single $\varphi$ is assumed in Theorem 7.6, rather than $\mathrm{Ric}_{N_\varphi} \ge 0$ and $\mathrm{Hess}\,\Psi_i \ge K_i$. If $\mathrm{Hess}\,\Psi_i \ge K_i$ and $l_\varphi > -\infty$, then Lemma 4.2(i) gives a stronger estimate on the diameter of $M_\varphi^{\Psi_i}$ as



diamM^' < 2^-^ {In^dldilU) - Q < 2^-^ {In^fe) - l^}. 
Indeed, we observe from (12.91) and Lemma [2.91 that 



K, - K.ifi^,) - Ki K,{l~e^) 

provided that 9^ 1. If 6',^ < 1, then the leading term (as — > oo) is 

^0 (^ oo) 



1-9^ K, 



under the condition (jej) in Theorem 17.61 Similarly, for 9^> 1 the leading term is 

^0 (i^ oo). 



9^-1 Ki 



Therefore, in both cases, $\lim_{i \to \infty} \mathrm{diam}\, M_\varphi^{\Psi_i} = 0$ holds and it is obviously stronger than $\lim_{i \to \infty} \alpha_{(M,\nu_i)}(r) = 0$.

7.3 $m(\varphi)$-normal concentration

In order to derive the $m$-normal concentration for some $m = m(\varphi)$ from the general estimate (7.3), we prove a computational lemma on $e_m$ (see also [OT, Lemma 6.4]). Recall from Subsection 2.4 that

$$e_m(r) = \exp_m(r) = [1 + (m-1)r]^{1/(m-1)}.$$
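
As a quick consistency check on the $m$-exponential, the following sketch assumes the model function $\varphi_m(s) = s^{2-m}$ (consistent with $s/\varphi_m(s) = s^{m-1}$ used in Remark 6.2), writes $\ln_m$ for $\ln_{\varphi_m}$ (a shorthand used here), and restricts to $r$ with $1 + (m-1)r > 0$.

```latex
% Sketch: \ln_m(t) = \int_1^t s^{m-2}\,ds is the inverse of e_m, and both reduce
% to the classical logarithm and exponential as m -> 1.
\[
  \ln_m(t) = \frac{t^{m-1} - 1}{m-1}, \qquad
  \ln_m\big(e_m(r)\big) = \frac{[1 + (m-1)r] - 1}{m-1} = r .
\]
\[
  \lim_{m \to 1} e_m(r)
  = \lim_{m \to 1} \exp\!\Big( \frac{\ln\big(1 + (m-1)r\big)}{m-1} \Big) = e^{r},
  \qquad
  \lim_{m \to 1} \ln_m(t) = \ln t .
\]
```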



Lemma 7.8 (i) Given $0 < m < m' < 1$ with $m + m' > 1$, set

$$\beta = \beta(m,m') := 1 + \frac{1-m'}{1-m} \in (1,2].$$

Then we have $\beta m' > 1$ and, for any $a, r > 0$,

" - +—)^ e„(/3)e„ (- (l - ^ \ 

V m' ) ml ) \ \ pm! J m + m' — 1 



(ii) For any m G [1,2) and a, r > 0, we have 

em{{ar - if - l) > e„ e„ (^yr^^ 
Proof. (i) The assumptions $m' < 1$ and $m + m' > 1$ yield $(m+m')(1-m') > (1-m')$ and hence

$$\beta m' = \frac{m'\{2 - (m+m')\}}{1-m} = \frac{(m'+m)(1-m') + m' - m}{1-m} > \frac{1 - m' + m' - m}{1-m} = 1.$$

From the direct calculation 



1 — m 



1 — m 



ar 



2 2 

-a r 



2ar 



2^2 



a r 



'm' I rri' 
and the monotonicity of Cm, we deduce that 



+ ^ < -a'r' + -^ + (3 



ar 



m' 



1 + (m - 1) 



/3m' 



pm' / 

-1 l/(m.-l) 



+ /3 



{1 + (m - J 1 - (m - 1) f 1 



1 



f3m' J m + m' — 1 

(ii) The assertion for m = 1 (with ei(r) = e^) is easily checked. For m G (1,2), we 
deduce from 



(ar — 1)^ — 1 = a^r^ — 2ar > a^r^ 



2 2 I 

a r + 



2„2 



1 a r 

2 / 



that 



^((ar- 1)2-1) > 



m 



1 1 2 2 

1 — \ a. r 

m 



m 



l + (m-l)<i (l--)aV-- 



1 - (m- 1) — 
m 



2 

l/(m-l) 



1 l/(m-l) 



677), I 



m 



ma 



m — 1 
2 

2 

m 



1 2 
ma r 



l/(m-l) 



□ 



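Theorem 7.9 ($m(\varphi)$-normal concentration) Assume that $(M,\omega,\varphi,\Psi)$ is admissible, $\nu \in \mathcal{P}_{ac}(M,\omega)$, $\mathrm{Hess}\,H_\varphi \ge K$ for some $K > 0$ and that $\|\sigma\|_\infty < \infty$. Fix arbitrary $\xi_0 \ge \max\{1, \|\sigma\|_\infty\}$.

(i) If $\theta_\varphi < 1$ and $\delta_\varphi > 0$, then we have for any $r > 0$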



a(ry^ > 



(l-5^)(2-5^) 



1 ^ c^^-l 2 

62-5, I j^o r 
(ii) If $\theta_\varphi \in (1,3/2)$, $\delta_\varphi > 3(\theta_\varphi - 1)$ and if $\omega[M] < \infty$, then we have for any $r > 0$



a{r) < 



26^ — 6^ — 1 



(iii) If $\theta_\varphi = 1$ and $\delta_\varphi > 1/2$, then we have for any $r > 0$

Proof. We abbreviate $\alpha(r)$ as $\alpha$ in this proof, and assume $\alpha > 0$ without loss of generality. Let $A \subset M$ be a measurable set with $\nu[A] \ge 1/2$ and put $B := M \setminus B(A,r)$.
(i) We first observe 

a^-^^ duj < \\a\\]^^^ I a duo < ail"'^" . 

B Jb 

Then (Q yields 

a'^-Hn,{a) < ^o'^-^j - (^^r - ^T^j - i/.}, (7.5) 
where Hy := H^{u). On the one hand, it follows from (12 .Qp that 

a'--Hn^{a) > a'--H^^eM = • 

Since a"^^"^^ > 1 > (1 - 0^)/{l - ^^), we obtain 

On the other hand. Lemmas I7.5l (i) and 12.91 give 

Hence we have 



^2^sJCa~')>{\l^^r-l) -1. 



We apply Lemma I7.8( ii) and obtain 



(l-5^)(2-5^) 
(ii) We deduce from the Hölder inequality that

2 Q 



Then (I7.3p gives 



a'--'- In^(a) < ^'.--'^uiM]'-'^ | - (^^r - ' - . 

Set m := 2(1 — 0^) + and m' := 2 — ^<^, and observe < m < m' < 1 as well as 
m + m' > 1. Similarly to (i), fl2.9p yields 



a''^ \nja) > a 



2-0. a 



As a™-™' > 1 > (1 - m')/(l - m), we find 



a-^^-^^ In^(a) > = C(ca), c := ^ ) < 1. 

m — I \1 — m' 

Lemmas I7.5( ii) and 12.91 imply 



{2 - e^M^o) - 2 - ml' 
and hence 



Then we apply Lemma I7.8( i) to have, with /3 = (2 — m — m') /(I — m) 

a < c~^e„(/3)e„ - 



I V /5"^7 2(m + m'-l) i 



(iii) It immediately follows from (17. 5p and (17. 6p that 



a^-^ K(a) < Co'-^ I - (y|r - ^ " ^ " ( V 

Note that (12. 9p provides a^-^'^ In^(a) > a*^-^"^ ln(a). If = 1, then it holds a^'^~^ ln(a) = 
— ln(a~^). Otherwise, the numerical estimate 

ln(t) > — for t e (0, 1], s > 0, 

shows a^"^'^ ln(a) > — £3_25^(a~^) (let 5 = 1 — 5^ and t = a). Therefore we have, thanks 
to Lemma [7.8l (ii) with m = 3 — 26^ < 2, 



□ 
Note that $3(\theta_\varphi - 1) < \theta_\varphi$ in (ii) by $\theta_\varphi < 3/2$, so that the condition $\delta_\varphi > 3(\theta_\varphi - 1)$ is not vacuous.

Remark 7.10 Letting 5^ = and then 6'^ — i- 1, all of the estimates (i)-(iii) in Theo- 
rem [TSltend to the normal concentration a(r) < exp(— 

8 Gradient flow of $H_\varphi$: Compact case

In this and the next sections, we show that the gradient flow of the $\varphi$-relative entropy $H_\varphi$ produces weak solutions to the nonlinear evolution equation

$$\frac{\partial\rho}{\partial t} = \mathrm{div}_\omega\!\left( \frac{\rho}{\varphi(\rho)}\nabla\rho + \rho\nabla\Psi \right)$$

on the weighted Riemannian manifold $(M,\omega)$. See the beginning of Subsection 8.3 for more explanation and background. This kind of interpretation of evolution equations has turned out extremely useful after the pioneering work due to Jordan et al. [JKO]. There are several ways of interpreting this coincidence. In this section, we adapt the rather 'metric geometric' approach developed in [Oh1] inspired by [PP] and [Ly] (see also [Pe]). This formulation of gradient flows requires a strong structure theorem (Theorem 8.1) of the Wasserstein space, which is known only for compact spaces. The noncompact situation will be treated in the next section in a different strategy along [AGS1] and [Er].

Before beginning the review of the structure of Wasserstein spaces, let us recall basic notions of calculus on our weighted Riemannian manifold $(M,\omega)$ with $\omega = e^{-f}\mathrm{vol}_g$. For a differentiable vector field $V$ on $M$, the weighted divergence is defined as

$$\mathrm{div}_\omega V := \mathrm{div}\,V - \langle V, \nabla f \rangle,$$

where $\mathrm{div}\,V$ denotes the usual divergence of $V$ for the unweighted space $(M,\mathrm{vol}_g)$. Note that $\mathrm{div}_\omega V = e^{f}\,\mathrm{div}(e^{-f}V)$ and, for any $w \in C_c^\infty(M)$, the integration by parts holds:

$$\int_M \langle \nabla w, V \rangle\, d\omega = \int_M \langle \nabla w, e^{-f}V \rangle\, d\mathrm{vol}_g = -\int_M w\, \mathrm{div}(e^{-f}V)\, d\mathrm{vol}_g = -\int_M w\, \mathrm{div}_\omega V\, d\omega.$$

Through this formula, the weighted divergence is defined in the weak sense also for measurable vector fields. For $\rho \in H^1_{loc}(M)$, the weighted Laplacian is defined in the weak form by

$$\Delta^\omega\rho := \mathrm{div}_\omega(\nabla\rho) = \Delta\rho - \langle \nabla\rho, \nabla f \rangle.$$
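
The two expressions for the weighted divergence agree by the product rule. A short check, assuming $\omega = e^{-f}\mathrm{vol}_g$ with $f$ differentiable and $V$ a differentiable vector field:

```latex
% Sketch: product rule for the divergence of e^{-f} V, then multiply by e^{f}.
\begin{align*}
  e^{f}\,\mathrm{div}\big(e^{-f}V\big)
  &= e^{f}\big( e^{-f}\,\mathrm{div}\,V + \langle \nabla e^{-f}, V \rangle \big)
   = e^{f}\big( e^{-f}\,\mathrm{div}\,V - e^{-f}\langle \nabla f, V \rangle \big) \\
  &= \mathrm{div}\,V - \langle V, \nabla f \rangle = \mathrm{div}_\omega V .
\end{align*}
% Taking V = \nabla\rho recovers the weighted Laplacian:
\[
  \Delta^\omega\rho = \mathrm{div}_\omega(\nabla\rho) = \Delta\rho - \langle \nabla\rho, \nabla f \rangle .
\]
```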
8.1 Geometric structure of $(\mathcal{P}(M), W_2)$

Let $M$ be compact throughout the section, so that $\mathcal{P}(M) = \mathcal{P}^2(M)$. It is known that $(\mathcal{P}(M), W_2)$ is an Alexandrov space of nonnegative curvature if and only if $(M,g)$ has the nonnegative sectional curvature ([St2, Proposition 2.10], [LV2, Theorem A.8]). Alexandrov spaces are metric spaces whose sectional curvature is bounded from below by a constant in the sense of the triangle comparison property, and such spaces are known to possess nice infinitesimal structures (we refer to [BBI] for the basic theory). We remark that it is in most cases impossible to bound the curvature of $\mathcal{P}(M)$ from above (cf. [AGS1, Example 7.3.3]). In the case where $(M,g)$ is not nonnegatively curved, although $(\mathcal{P}(M), W_2)$ does not admit any lower curvature bound in the sense of Alexandrov ([St2, Proposition 2.10]), we can consider the 'angle' between geodesics (see also [Oh1, Theorem 3.6]).

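Theorem 8.1 ([Gi2, Theorem 3.4, Remark 3.5]) For any $\mu \in \mathcal{P}(M)$ and unit speed geodesics $\alpha, \beta : [0,\delta) \to \mathcal{P}(M)$ with $\alpha(0) = \beta(0) = \mu$, the joint limit

$$\lim_{s,t \downarrow 0} \frac{s^2 + t^2 - W_2(\alpha(s),\beta(t))^2}{2st}$$

exists.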

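Theorem 8.1 in particular guarantees that the scaling limit

$$\lim_{\varepsilon \downarrow 0} \frac{(s\varepsilon)^2 + (t\varepsilon)^2 - W_2(\alpha(s\varepsilon),\beta(t\varepsilon))^2}{2st\varepsilon^2}$$

exists, and is independent of the choices of the parameters $s, t > 0$. This means that an angle between $\alpha$ and $\beta$ makes sense, so that $(\mathcal{P}(M), W_2)$ looks like a Riemannian space (rather than a Finsler space). This observation makes it possible to investigate the infinitesimal structure of $(\mathcal{P}(M), W_2)$ in the manner of the theory of Alexandrov spaces.

For $\mu \in \mathcal{P}(M)$, denote by $\Sigma'_\mu[\mathcal{P}(M)]$ the set of all nontrivial unit speed minimal geodesics emanating from $\mu$. Given $\alpha, \beta \in \Sigma'_\mu[\mathcal{P}(M)]$, Theorem 8.1 verifies that the angle

$$\angle_\mu(\alpha,\beta) := \arccos\left( \lim_{s,t \downarrow 0} \frac{s^2 + t^2 - W_2(\alpha(s),\beta(t))^2}{2st} \right) \in [0,\pi]$$

is well-defined. We define the space of directions $(\Sigma_\mu[\mathcal{P}(M)], \angle_\mu)$ as the completion of $(\Sigma'_\mu[\mathcal{P}(M)]/\!\sim, \angle_\mu)$, where $\alpha \sim \beta$ holds if $\angle_\mu(\alpha,\beta) = 0$. The angle $\angle_\mu$ provides a natural distance structure of $\Sigma_\mu[\mathcal{P}(M)]$. The tangent cone $(C_\mu[\mathcal{P}(M)], \sigma_\mu)$ is defined as the Euclidean cone over $(\Sigma_\mu[\mathcal{P}(M)], \angle_\mu)$, i.e.,

$$C_\mu[\mathcal{P}(M)] := \big( \Sigma_\mu[\mathcal{P}(M)] \times [0,\infty) \big) / \big( \Sigma_\mu[\mathcal{P}(M)] \times \{0\} \big),$$

$$\sigma_\mu\big( (\alpha,s), (\beta,t) \big) := \sqrt{s^2 + t^2 - 2st\cos\angle_\mu(\alpha,\beta)}.$$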
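
For orientation only, both formulas can be tested in the flat model, which is not the Wasserstein space itself: take $\mathbb{R}^n$ in place of $(\mathcal{P}(M), W_2)$, with unit vectors $a, b$, a base point $x$, and the rays $\alpha(s) = x + sa$, $\beta(t) = x + tb$ (all of these are illustrative choices made here).

```latex
% Sketch: in Euclidean space the quotient is constant in s, t and equals the cosine
% of the usual angle, and the cone metric is the Euclidean distance.
\[
  \frac{s^2 + t^2 - |sa - tb|^2}{2st}
  = \frac{s^2 + t^2 - \big(s^2 - 2st\langle a,b\rangle + t^2\big)}{2st}
  = \langle a, b\rangle = \cos\angle(a,b),
\]
\[
  \sqrt{s^2 + t^2 - 2st\cos\angle(a,b)} = |sa - tb| .
\]
```

In this model the tangent cone at $x$ is $\mathbb{R}^n$ with its usual distance, which is the sense in which the construction above is 'Riemannian'.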

By means of this infinitesimal structure, we introduce a class of 'differentiable curves'. 

Definition 8.2 (Right differentiability) We say that a curve $\xi : [0,l) \to \mathcal{P}(M)$ is right differentiable at $t \in [0,l)$ if there is $v \in C_{\xi(t)}[\mathcal{P}(M)]$ such that, for any sequences $\{\varepsilon_i\}_{i \in \mathbb{N}}$ of positive numbers tending to zero and $\{\alpha_i\}_{i \in \mathbb{N}}$ of unit speed minimal geodesics from $\xi(t)$ to $\xi(t+\varepsilon_i)$, the sequence $\{(\alpha_i, W_2(\xi(t),\xi(t+\varepsilon_i))/\varepsilon_i)\}_{i \in \mathbb{N}} \subset C_{\xi(t)}[\mathcal{P}(M)]$ converges to $v$. Such $v$ is clearly unique if it exists, and then we write $\dot\xi(t) = v$.
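8.2 Gradient flows in $(\mathcal{P}(M), W_2)$

Consider a lower semi-continuous function $H : \mathcal{P}(M) \to (-\infty,+\infty]$ which is $K$-convex in the weak sense for some $K \in \mathbb{R}$. We in addition suppose that $H$ is not identically $+\infty$, and define $\mathcal{P}^*_H(M) := \{ \mu \in \mathcal{P}(M) \mid H(\mu) < \infty \}$. Given $\mu \in \mathcal{P}^*_H(M)$ and $\alpha \in \Sigma'_\mu[\mathcal{P}(M)]$, we set

$$D_\mu H(\alpha) := \liminf_{\Sigma'_\mu[\mathcal{P}(M)] \ni \beta \to \alpha}\ \lim_{t \downarrow 0} \frac{H(\beta(t)) - H(\mu)}{t},$$

where the convergence $\beta \to \alpha$ is with respect to $\angle_\mu$. Define the absolute gradient (also called the local slope) of $H$ at $\mu \in \mathcal{P}^*_H(M)$ by

$$|\nabla_- H|(\mu) := \max\left\{ 0,\ \limsup_{\tilde\mu \to \mu} \frac{H(\mu) - H(\tilde\mu)}{W_2(\mu,\tilde\mu)} \right\},$$

where $\tilde\mu \to \mu$ is with respect to $W_2$. Note that $-D_\mu H(\alpha) \le |\nabla_- H|(\mu)$ for any $\alpha \in \Sigma'_\mu[\mathcal{P}(M)]$. The $K$-convexity of $H$ guarantees the unique existence of the direction along which $H$ decreases the most.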

Lemma 8.3 ([Oh1, Lemma 4.2]) For each $\mu \in \mathcal{P}^*_H(M)$ with $0 < |\nabla_- H|(\mu) < \infty$, there exists a unique direction $\alpha \in \Sigma_\mu[\mathcal{P}(M)]$ satisfying $D_\mu H(\alpha) = -|\nabla_- H|(\mu)$.

Using $\alpha$ in the above lemma, we define the negative gradient vector of $H$ at $\mu$ by

$$\nabla_- H(\mu) := \big( \alpha, |\nabla_- H|(\mu) \big) \in C_\mu[\mathcal{P}(M)].$$

If $|\nabla_- H|(\mu) = 0$, then we simply define $\nabla_- H(\mu)$ as the origin of $C_\mu[\mathcal{P}(M)]$. A trajectory of the gradient flow of $H$ (which will be called a gradient curve) should be understood as a curve $\xi$ solving $\dot\xi(t) = \nabla_- H(\xi(t))$. Precisely, we adopt the following definition.

Definition 8.4 (Gradient curves) A continuous curve $\xi : [0,l) \to \mathcal{P}^*_H(M)$ which is locally Lipschitz on $(0,l)$ is called a gradient curve of $H$ if $|\nabla_- H|(\xi(t)) < \infty$ for all $t \in (0,l)$ and if it is right differentiable with $\dot\xi(t) = \nabla_- H(\xi(t))$ at all $t \in (0,l)$. We say that a gradient curve is complete if it is defined on entire $[0,\infty)$.

By virtue of the $K$-convexity of $H$ as well as the compactness of $M$, there starts a unique gradient curve from an arbitrary initial point $\mu \in \mathcal{P}^*_H(M)$ enjoying the $K$-contraction property as follows.

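Theorem 8.5 ([Oh1, Theorem 5.11, Corollary 6.3], [GO, Theorem 4.2]) Let $M$ be compact and $H : \mathcal{P}(M) \to (-\infty,+\infty]$ be a $K$-convex function for some $K \in \mathbb{R}$.

(i) From any $\mu \in \mathcal{P}^*_H(M)$, there exists a unique complete gradient curve $\xi : [0,\infty) \to \mathcal{P}^*_H(M)$ of $H$ with $\xi(0) = \mu$.

(ii) ($K$-contraction property) Given any two gradient curves $\xi, \zeta : [0,\infty) \to \mathcal{P}^*_H(M)$ of $H$, we have

$$W_2\big( \xi(t), \zeta(t) \big) \le e^{-Kt}\, W_2\big( \xi(0), \zeta(0) \big) \tag{8.1}$$

for all $t \in [0,\infty)$.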
The uniqueness in (i) is indeed a consequence of the $K$-contraction property (8.1). Thus the gradient flow $G : [0,\infty) \times \mathcal{P}^*_H(M) \to \mathcal{P}^*_H(M)$ of $H$, given as $G(t,\mu) = \xi(t)$ for $\xi$ in Theorem 8.5(i), is uniquely determined and continuously extended to the closure $G : [0,\infty) \times \overline{\mathcal{P}^*_H(M)} \to \overline{\mathcal{P}^*_H(M)}$.

8.3 $H_\varphi$ and the $\varphi$-heat equation

It is an established fact that the gradient flow of the relative entropy (or the free energy) with respect to $\omega$,

$$\mathrm{Ent}_\omega(\rho\omega) = \int_M \rho\ln\rho\, d\omega = \int_M (\rho e^{-f})\ln(\rho e^{-f})\, d\mathrm{vol}_g + \int_M f\, d\mu,$$

produces solutions to the associated heat equation (or the Fokker-Planck equation)

$$\frac{\partial\rho}{\partial t} = \Delta^\omega\rho = e^{f}\big\{ \Delta(\rho e^{-f}) + \mathrm{div}\big( (\rho e^{-f})\nabla f \big) \big\}.$$

See [JKO, Theorem 5.1], [Vi1, Subsection 8.4.2] for the Euclidean case, [Oh1, Theorem 6.6], [GO, Theorem 4.6], [Vi2, Corollary 23.23] for the Riemannian case, [OS1, Section 7] for the Finsler case, and [FSS], [Ju], [GKO], [Ma], [AGS2] for further related work on various kinds of spaces.

We shall see that a similar argument gives weak solutions to the equation

$$\frac{\partial\rho}{\partial t} = \mathrm{div}_\omega\!\left( \frac{\rho}{\varphi(\rho)}\nabla\rho + \rho\nabla\Psi \right) \tag{8.2}$$

as the gradient flow of the relative entropy $H_\varphi$. We will call (8.2) the $\varphi$-heat equation. In the special case of $\varphi_m(s) = s^{2-m}$, (8.2) is called the fast diffusion equation (for $m < 1$) or the porous medium equation (for $m > 1$). Then this identification was demonstrated by Otto [Ot] on $(\mathbb{R}^n, \mathcal{L}^n)$, and by [Vi2, Theorem 23.19] as well as [OT] on weighted Riemannian manifolds (by different means). We can follow the strategy of [OT] for general $\varphi$, up to some technical difficulties.
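
The shape of the $\varphi$-heat equation can be motivated formally. The sketch below is heuristic and not the argument of this section: it assumes smooth positive densities, treats the negative gradient vector $-\nabla\rho/\varphi(\rho) - \nabla\Psi$ of Proposition 8.6 as a velocity field $v$ (a name introduced here), and inserts it into the continuity equation on $(M,\omega)$. The model case uses $\varphi_m(s) = s^{2-m}$.

```latex
% Heuristic sketch: continuity equation with velocity v = -\nabla\rho/\varphi(\rho) - \nabla\Psi.
\[
  \frac{\partial\rho}{\partial t} = -\mathrm{div}_\omega(\rho\,v)
  = \mathrm{div}_\omega\!\Big( \frac{\rho}{\varphi(\rho)}\nabla\rho + \rho\nabla\Psi \Big).
\]
% Model case \varphi_m(s) = s^{2-m}:
\[
  \frac{\rho}{\varphi_m(\rho)}\nabla\rho = \rho^{m-1}\nabla\rho = \frac{1}{m}\nabla(\rho^m),
  \qquad
  \frac{\partial\rho}{\partial t} = \frac{1}{m}\Delta^\omega(\rho^m) + \mathrm{div}_\omega(\rho\nabla\Psi).
\]
```

For $m = 1$ the first term is the weighted Laplacian of $\rho$, which matches the linear heat (Fokker-Planck) case recalled above.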

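We first observe $|\nabla_- H_\varphi|(\mu) = \sqrt{I_\varphi(\mu)}$ as Proposition 6.1 suggests.

Proposition 8.6 Let $(M,\omega,\varphi,\Psi)$ be a compact admissible space such that $\mathrm{Ric}_{N_\varphi} \ge 0$ and $\mathrm{Hess}\,\Psi \ge K$ for some $K \in \mathbb{R}$. Take $\mu = \rho\omega \in \mathcal{P}_{ac}(M,\omega)$ with $\mu[M_\varphi^\Psi] = 1$, $H_\varphi(\mu) < \infty$, $\rho h'_\varphi(\rho) - h_\varphi(\rho) \in H^1(M)$ and with $|\nabla\rho/\varphi(\rho)| \in L^2(M,\mu)$. Then we have $|\nabla_- H_\varphi|(\mu) = \sqrt{I_\varphi(\mu)}$, and the negative gradient vector $\nabla_- H_\varphi(\mu)$ is given by $-\nabla\rho/\varphi(\rho) - \nabla\Psi$.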

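Proof. Given any $\mu_1 \in \mathcal{P}(M)$ with $H_\varphi(\mu_1) < \infty$, let $(\mu_t)_{t \in [0,1]} \subset \mathcal{P}(M)$ be a minimal geodesic from $\mu_0 = \mu$ to $\mu_1$ along which $H_\varphi$ is $K$-convex (Theorem 5.7). Letting $\mu_t = (\mathcal{T}_t)_\sharp\mu$ with $\mathcal{T}_t(x) = \exp_x(t\nabla\phi(x))$, we deduce from the $K$-convexity of $H_\varphi$ that

$$\liminf_{t \downarrow 0} \frac{H_\varphi(\mu_t) - H_\varphi(\mu)}{t} \le H_\varphi(\mu_1) - H_\varphi(\mu) - \frac{K}{2} W_2(\mu,\mu_1)^2.$$

Combining this with Proposition 6.1, we have

$$\frac{H_\varphi(\mu) - H_\varphi(\mu_1)}{W_2(\mu,\mu_1)} \le -\frac{1}{W_2(\mu,\mu_1)} \int_M \left\langle \frac{\nabla\rho}{\varphi(\rho)} + \nabla\Psi, \nabla\phi \right\rangle d\mu - \frac{K}{2} W_2(\mu,\mu_1) \le \sqrt{I_\varphi(\mu)} - \frac{K}{2} W_2(\mu,\mu_1).$$

Thus we obtain $|\nabla_- H_\varphi|(\mu) \le \sqrt{I_\varphi(\mu)}$, and equality follows also from Proposition 6.1 by choosing $\{\phi_j\}_{j \in \mathbb{N}} \subset C^\infty(M)$ which approximates $-\ln_\varphi(\rho) + \ln_\varphi(\sigma)$ in $H^1(M,\mu)$. Then, moreover, $\nabla_- H_\varphi(\mu)$ is achieved by $-\nabla\rho/\varphi(\rho) - \nabla\Psi$ (to be precise, $((\mu_t)_{t \in [0,1]}, W_2(\mu,\mu_1))$ associated with $\phi_j$ converges to $\nabla_- H_\varphi(\mu)$ in $C_\mu[\mathcal{P}(M)]$). □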

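Now we are ready to show the main theorem of the section.

Theorem 8.7 (Gradient flow of $H_\varphi$) Let $(M,\omega,\varphi,\Psi)$ be a compact admissible space such that $\mathrm{Ric}_{N_\varphi} \ge 0$ and $\mathrm{Hess}\,\Psi \ge K$ on $M_\varphi^\Psi$ for some $K \in \mathbb{R}$. We in addition assume that $\theta_\varphi \in (0,(n+1)/n)$, $\lim_{s \to \infty} s^{\theta_\varphi}/\varphi(s) < \infty$ and that $\Psi$ is Lipschitz. If a curve $(\mu_t)_{t \in [0,\infty)} \subset \mathcal{P}_{ac}(M,\omega)$ with $\mu_t[M_\varphi^\Psi] = 1$ is a gradient curve of $H_\varphi$, then its density function $\rho_t$ is a weak solution to the $\varphi$-heat equation (8.2). To be precise, $\rho_t$ is weakly differentiable as well as $|\nabla\rho_t/\varphi(\rho_t)| \in L^2(M,\mu_t)$ for a.e. $t$, and we have

$$\int_M w_{t_1}\, d\mu_{t_1} - \int_M w_{t_0}\, d\mu_{t_0} = \int_{t_0}^{t_1}\!\!\int_M \left\{ \frac{\partial w_t}{\partial t} - \left\langle \frac{\nabla\rho_t}{\varphi(\rho_t)} + \nabla\Psi, \nabla w_t \right\rangle \right\} d\mu_t\, dt \tag{8.3}$$

for all $0 < t_0 < t_1 < \infty$ and $w \in C^\infty(\mathbb{R} \times M)$, where $\mu_t = \rho_t\omega$ and $w_t = w(t,\cdot)$.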

Proof. First of all, the weak differentiability of $\rho_t$ and $|\nabla\rho_t/\varphi(\rho_t)| \in L^2(M,\mu_t)$ follow from (I) $\Rightarrow$ (II) of Proposition 9.6 below. Fix $t \in (0,\infty)$ and, given small $\delta > 0$, choose $\mu^\delta \in \mathcal{P}(M)$ as a minimizer of the function

$$\mu \longmapsto H_\varphi(\mu) + \frac{W_2(\mu,\mu_t)^2}{2\delta}. \tag{8.4}$$

We postpone the proof of the following technical claim until the end of the section. The condition $\theta_\varphi < (n+1)/n$ will come into play in (i), while $\theta_\varphi > 0$ and $\lim_{s \to \infty} s^{\theta_\varphi}/\varphi(s) < \infty$ will be used in (iii).

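Claim 8.8 (i) Such a minimizer $\mu^\delta$ of (8.4) indeed exists and is absolutely continuous with respect to $\omega$.

(ii) We have

$$\lim_{\delta \downarrow 0} \frac{W_2(\mu^\delta,\mu_t)^2}{\delta} = 0, \qquad \lim_{\delta \downarrow 0} H_\varphi(\mu^\delta) = H_\varphi(\mu_t).$$

In particular, $\mu^\delta$ converges to $\mu_t$ weakly.

(iii) Moreover, by putting $\mu^\delta = \rho^\delta\omega$, $h_\varphi(\rho^\delta) - h'_\varphi(\rho^\delta)\rho^\delta$ converges to $h_\varphi(\rho_t) - h'_\varphi(\rho_t)\rho_t$ in $L^1(M,\omega)$ as $\delta \downarrow 0$.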
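Take a semi-convex function $\phi : M \to \mathbb{R}$ such that $\mathcal{T}(x) := \exp_x(\nabla\phi(x))$ gives the optimal transport from $\mu^\delta$ to $\mu_t$ (recall Theorem 2.6). We also consider the transport $\mu^\delta_\varepsilon := (\mathcal{F}_\varepsilon)_\sharp\mu^\delta$ in another direction for small $\varepsilon > 0$, where $\mathcal{F}_\varepsilon(x) := \exp_x(\varepsilon\nabla w_t(x))$. It immediately follows from the choice of $\mu^\delta$ that

$$H_\varphi(\mu^\delta_\varepsilon) + \frac{W_2(\mu^\delta_\varepsilon,\mu_t)^2}{2\delta} \ge H_\varphi(\mu^\delta) + \frac{W_2(\mu^\delta,\mu_t)^2}{2\delta}. \tag{8.5}$$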



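We first estimate the difference of the Wasserstein distances. Observe that, as $(\mathcal{F}_\varepsilon \times \mathcal{T})_\sharp\mu^\delta$ is a (not necessarily optimal) coupling of $\mu^\delta_\varepsilon$ and $\mu_t$,

$$\limsup_{\varepsilon \downarrow 0} \frac{W_2(\mu^\delta_\varepsilon,\mu_t)^2 - W_2(\mu^\delta,\mu_t)^2}{\varepsilon} \le \limsup_{\varepsilon \downarrow 0} \frac{1}{\varepsilon} \int_M \big\{ d_g(\mathcal{F}_\varepsilon(x),\mathcal{T}(x))^2 - d_g(x,\mathcal{T}(x))^2 \big\}\, d\mu^\delta(x) = -\int_M 2\langle \nabla w_t, \nabla\phi \rangle\, d\mu^\delta.$$

We used the first variation formula for the Riemannian distance function $d_g$ in the last line (cf., e.g., [Chv, Theorem II.4.1]). Thanks to the compactness of $M$, there is a constant $C > 0$ (depending only on $(M,g)$ and $w$) such that

$$w_t(\mathcal{T}(x)) \le w_t(x) + \langle \nabla w_t(x), \nabla\phi(x) \rangle + C d_g(x,\mathcal{T}(x))^2$$

for a.e. $x \in M$. Thus we obtain, by virtue of Claim 8.8(ii),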

hm mt — hm sup < — hm sup - / ( Vwt, V0) djj, 

&io 2d £ (5;o <J Ja/ 



< lim inf - 



[ {wt-wt{r)}dfi' + cw2{fi',fitf 

J M 



lim inf - < / Wi di/ — i Wt dui 

540 6 \ Jm 



IM 

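Next we calculate the difference of the entropies in (8.5). We put $\mu^\delta = \rho^\delta\omega$, $\mu^\delta_\varepsilon = \rho^\delta_\varepsilon\omega$ and $\mathbf{J}^\omega_\varepsilon := e^{f - f(\mathcal{F}_\varepsilon)}\det(D\mathcal{F}_\varepsilon)$. Then we obtain from Proposition 6.1 that, as $w_t \in C^\infty(M)$,

$$\lim_{\varepsilon \downarrow 0} \frac{H_\varphi(\mu^\delta) - H_\varphi(\mu^\delta_\varepsilon)}{\varepsilon} = \int_M \big[ \{ h'_\varphi(\rho^\delta)\rho^\delta - h_\varphi(\rho^\delta) \}\Delta^\omega w_t + \rho^\delta \langle \nabla[\ln_\varphi(\sigma)], \nabla w_t \rangle \big]\, d\omega$$

(we need the conditions $\mathrm{Ric}_{N_\varphi} \ge 0$ and $\mathrm{Hess}\,\Psi \ge K$ only here for applying Proposition 6.1). Hence we deduce that, together with Claim 8.8(ii), (iii),

$$\begin{aligned} \lim_{\delta \downarrow 0}\lim_{\varepsilon \downarrow 0} \frac{H_\varphi(\mu^\delta) - H_\varphi(\mu^\delta_\varepsilon)}{\varepsilon} &= \int_M \big[ \{ h'_\varphi(\rho_t)\rho_t - h_\varphi(\rho_t) \}\Delta^\omega w_t - \rho_t\langle \nabla\Psi, \nabla w_t \rangle \big]\, d\omega \\ &= -\int_M \langle \nabla[h'_\varphi(\rho_t)\rho_t - h_\varphi(\rho_t)] + \rho_t\nabla\Psi, \nabla w_t \rangle\, d\omega = -\int_M \left\langle \frac{\nabla\rho_t}{\varphi(\rho_t)} + \nabla\Psi, \nabla w_t \right\rangle d\mu_t. \end{aligned}$$

These together imply

$$\liminf_{\delta \downarrow 0} \frac{1}{\delta}\left\{ \int_M w_t\, d\mu^\delta - \int_M w_t\, d\mu_t \right\} \ge -\int_M \left\langle \frac{\nabla\rho_t}{\varphi(\rho_t)} + \nabla\Psi, \nabla w_t \right\rangle d\mu_t.$$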
Moreover, equality holds since we can change $w$ into $-w$. Recall from [GO, (5)] (see also [Oh1, Lemma 6.4]) that



d [ Jm Jm 
holds for all $\eta \in C^\infty(M)$. Therefore we conclude

lim ^ I I Wt+5 dfit+5 - I Wt dfit 
hi JM 



54.0 5 
hn. ^ 

54,0 



lim ^ <j / {wt+5 - Wt) d^t+5 + I Wt dfit+5 - I Wt dfit 
M Jm Jm 



This shows (8.3) by integration in $t$. □

Remark 8.9 In Theorem 8.7, assuming that $\mu_t$ is absolutely continuous is in fact redundant. If $L_\varphi = \infty$, then $H_\varphi(\mu_t) < \infty$ guarantees $\mu_t \in \mathcal{P}_{ac}(M,\omega)$ by definition. As for $L_\varphi < \infty$, if $\mu_t$ with $t > 0$ has a nontrivial singular part $\mu_t^s$, then we can modify $\mu_t$ as in the proof of Claim 8.8(i) below (with $\mu^\delta = \mu_t$ and $\pi = \mathrm{diag}_\sharp\mu_t$ where $\mathrm{diag}(x) := (x,x)$) and obtain $\mu_r \in \mathcal{P}_{ac}(M,\omega)$ for small $r > 0$ such that

$$W_2(\mu_r,\mu_t)^2 \le \mu_t^s[M]\, r^2, \qquad \lim_{r \downarrow 0} \frac{H_\varphi(\mu_r) - H_\varphi(\mu_t)}{r} = -\infty.$$

This yields $|\nabla_- H_\varphi|(\mu_t) = \infty$ and contradicts the definition of gradient curves (compare this discussion with [AGS1, Theorem 10.4.8]).

Combining Theorems 5.7, 8.5 and 8.7, we obtain the following.

Corollary 8.10 Let $(M,\omega,\varphi,\Psi)$ be an admissible space as in Theorem 8.7, and further suppose that $M_\varphi^\Psi$ is totally convex. Then the weak solution $(\mu_t)_{t \in [0,\infty)} \subset \mathcal{P}_{ac}(M_\varphi^\Psi,\omega)$ to the $\varphi$-heat equation constructed in Theorem 8.7 satisfies the $K$-contraction property (8.1).



8.4 Proof of Claim 8.8

(i) The existence follows from the compactness of $\mathcal{P}(M)$ and the lower semi-continuity of $H_\varphi$ (Lemma 5.6). The absolute continuity is obvious if $L_\varphi = \infty$.

Assume $L_\varphi < \infty$, so that $\theta_\varphi \in (1,(n+1)/n)$ and $N_\varphi = (\theta_\varphi - 1)^{-1} \in (n,\infty)$ (Proposition 2.12(ii)). We decompose $\mu^\delta$ into absolutely continuous and singular parts $\mu^\delta = \rho\omega + \mu^s$ and suppose $\mu^s[M] > 0$. For small $r > 0$, we modify $\mu^\delta$ into $\mu_r = \rho_r\omega \in \mathcal{P}_{ac}(M,\omega)$ as

$$\rho_r(x) := \rho(x) + \int_M \frac{\chi_{B(y,r)}(x)}{\omega[B(y,r)]}\, d\mu^s(y).$$

We shall show that $\mu_r$ gives a better choice than $\mu^\delta$ in our approximation scheme (8.4), which is a contradiction and hence $\mu^s[M] = 0$. We first observe



JM JM JM ^[^\y-,^)\ J B{y,r) 

> [ /i;(a)V-|sup|V(/i;oa)|-r|/i^[M]. 



.6) 



Note that, on M*, h'^(a) = — \& — is Lipschitz since \1/ is Lipschitz. Given an optimal 
coupling TT = TTi + of /i^ and fit such that (pi)tj7ri = pu and (pi)Bvr2 = /i*, 

diTrix, z) := d-Kiix, z) + [ '^fiff^^ N^i duj(x) d7i2(y,z) 

is a coupling of fir and /xj. Hence we find 



W2{ftr,f^tf < dg 

Jmxm 



{x,zy d7Ti{x,z) + / {dg{y,z) +rYdn2{y,z) 



MxM 



< / dg{x,zfd7i{x,z) + {2di&mM + r}r7i2[MxM] 

J MxM 

< iy2(/,/ii)^ + {3diamM-r}/i'[M]. 
Next, observe that 



h^{Pr)dUJ= I , . rj.. 



As is convex, Jensen's inequality shows 



p{x) XB{y,r){x) 



dfi'iy) 



< 



/i^[M] 



Since /i,^ is non- increasing, we deduce from the Fubini theorem that 

h^{pr) du < / ( / + / ( !t\^^\. ) du } dp'{y) 

f^[M\JM[JM\B[y,r] J B(y,r) \(^[B{y,r)\' 



M 



< [ h^{p)duj-^—- I ( I h^{p) du] d^\y) 

Jm f^W\JM\JB(y,r) J 

+ sup <^ uj[B{y, r) ■ —- -r 

By virtue of the compactness of $M$, there are constants $0 < C_1 < C_2$ such that

$$C_1 r^n \le \omega[B(y,r)] \le C_2 r^n$$

for all $y \in M$ and small $r > 0$. Hence we have, as $h_\varphi$ is non-increasing and nonpositive,



sup { uj[B{y,r)\ ■ 

yeM 



fx'[M] 
u[B{y,r)] 



We find, by the monotonicity of In^, Lemma [2.91 and A'^^ 



limsupr /i<^(r" 

r\0 



lim sup \ r 



LP — y^if 



Car" 



ln<^(s) ds-r ^L^ 



POO 

< lim sup \r~^ \n^p{r~^'^) — r^^L^\ = — liminf / — — 

< — lim / ds = lim ^r-^ = — < 



ds 



rio (1 - 9^)r 1 - 9^ 



Hence we obtain, since n < < oo. 



— 7- — OO 



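as $r \downarrow 0$ (here we need the hypothesis $\theta_\varphi < (n+1)/n$). Finally, for all $y \in \operatorname{supp} \mu^s$, the
convexity of $h_\varphi$ yields

\[ \int_{B(y,r)} h_\varphi(\rho) \, d\omega \ge \int_{B(y,r)} \{ h_\varphi(\sigma) + h'_\varphi(\sigma)(\rho - \sigma) \} \, d\omega \]
\[ = \int_{B(y,r)} \{ h_\varphi(\sigma) - h'_\varphi(\sigma)\sigma \} \, d\omega + \int_{B(y,r)} h'_\varphi(\sigma)\rho \, d\omega. \]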



We therefore obtain 
1 



h^{pr) duj - I h,p{p) du 



M 

< -- inf 

r yeM 



M 



{h^ia) - h' (a)(T} du + h' (a) dp 



B{y,r) 



B{y,r) 



— — oo 

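as $r \downarrow 0$. Combining this with (8.6) and (8.7), we conclude that

\[ \lim_{r \downarrow 0} \frac{1}{r} \left\{ H_\varphi(\mu_r) + \frac{W_2(\mu_r,\mu_t)^2}{2\delta} - H_\varphi(\mu^\delta) - \frac{W_2(\mu^\delta,\mu_t)^2}{2\delta} \right\} = -\infty. \]

This contradicts the choice of $\mu^\delta$ as a minimizer of (8.4), so that it holds $\mu^s[M] = 0$.
(ii) By the choice of $\mu^\delta$, we have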



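\[ H_\varphi(\mu^\delta) + \frac{W_2(\mu^\delta,\mu_t)^2}{2\delta} \le H_\varphi(\mu_t). \]

Together with $H_\varphi(\mu^\delta) \ge H_\varphi(\nu)$ (Lemma 5.5), we immediately observe

\[ \lim_{\delta \downarrow 0} W_2(\mu^\delta,\mu_t)^2 \le \lim_{\delta \downarrow 0} 2\delta \{ H_\varphi(\mu_t) - H_\varphi(\nu) \} = 0. \]

Thus $\mu^\delta$ converges to $\mu_t$ weakly, and hence

\[ \limsup_{\delta \downarrow 0} \frac{W_2(\mu^\delta,\mu_t)^2}{2\delta} \le H_\varphi(\mu_t) - \liminf_{\delta \downarrow 0} H_\varphi(\mu^\delta) \le 0 \]

by the lower semi-continuity of $H_\varphi$ (Lemma 5.6). These further yield

\[ H_\varphi(\mu_t) \le \liminf_{\delta \downarrow 0} H_\varphi(\mu^\delta) \le \limsup_{\delta \downarrow 0} H_\varphi(\mu^\delta) \le H_\varphi(\mu_t). \]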

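(iii) This is a consequence of the following lemma.
Lemma 8.11 Assume that $\theta_\varphi \in (0,2)$ and

\[ C_\varphi := \lim_{s \to \infty} \frac{s^{\theta_\varphi}}{\varphi(s)} < \infty. \]

If a sequence $\{\mu_i\}_{i \in \mathbb{N}} \subset \mathcal{P}_{ac}(M,\omega)$ converges to $\mu \in \mathcal{P}_{ac}(M,\omega)$ weakly and satisfies
$\lim_{i \to \infty} H_\varphi(\mu_i) = H_\varphi(\mu) < \infty$, then, by setting $\mu_i = \rho_i\omega$ and $\mu = \rho\omega$, the function
$h_\varphi(\rho_i) - \rho_i h'_\varphi(\rho_i)$ converges to $h_\varphi(\rho) - \rho h'_\varphi(\rho)$ in $L^1(M,\omega)$.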

Proof. We first show the following claim by using $\theta_\varphi < 2$.

Claim 8.12 For any $C > 0$, it holds

\[ \lim_{i \to \infty} \| \min\{\rho,C\} - \min\{\rho_i,C\} \|_{L^2(M,\omega)} = 0. \]

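Proof. Assume the contrary, that is, there are some constants $C, \varepsilon > 0$ such that, taking
a subsequence of $\{\rho_i\}_{i \in \mathbb{N}}$ if necessary, we have

\[ \| \min\{\rho,C\} - \min\{\rho_i,C\} \|_{L^2(M,\omega)} \ge \varepsilon \tag{8.8} \]

for all $i$. Now, since $h''_\varphi(s) = \varphi(s)^{-1}$ is positive and non-increasing, we find

\[ h_\varphi\!\left( \frac{\rho + \rho_i}{2} \right) \le \frac{h_\varphi(\rho) + h_\varphi(\rho_i)}{2} - \frac{|\rho - \rho_i|^2}{8 \max\{\varphi(\rho),\varphi(\rho_i)\}}. \]

We shall further deduce from $\theta_\varphi < 2$ that

\[ \frac{|\rho - \rho_i|^2}{\max\{\varphi(\rho),\varphi(\rho_i)\}} \ge \frac{|\min\{\rho,C\} - \min\{\rho_i,C\}|^2}{\varphi(C)}. \tag{8.9} \]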
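This is clear if $\max\{\rho,\rho_i\} \le C$ or $\min\{\rho,\rho_i\} \ge C$. Otherwise, assuming $\rho_i < C < \rho$ without loss of generality, (8.9) is reduced to

\[ \frac{(\rho - \rho_i)^2}{\varphi(\rho)} \ge \frac{(C - \rho_i)^2}{\varphi(C)} \]

and to the monotonicity of the function $s \mapsto (s - \varepsilon)^2/\varphi(s)$ for $s > \varepsilon$. This monotonicity
is easily seen by Lemma 2.9 since $\theta_\varphi < 2$ and

\[ \frac{(s - \varepsilon)^2}{\varphi(s)} = \frac{s^{\theta_\varphi}}{\varphi(s)} \cdot \frac{(s - \varepsilon)^2}{s^{\theta_\varphi}}. \]

Thus we obtain from the hypothesis (8.8) that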



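\[ \int_M h_\varphi\!\left( \frac{\rho + \rho_i}{2} \right) d\omega \le \int_M \frac{h_\varphi(\rho) + h_\varphi(\rho_i)}{2} \, d\omega - \frac{1}{8\varphi(C)} \| \min\{\rho,C\} - \min\{\rho_i,C\} \|_{L^2(M,\omega)}^2 \]
\[ \le \frac{1}{2} \int_M h_\varphi(\rho) \, d\omega + \frac{1}{2} \int_M h_\varphi(\rho_i) \, d\omega - \frac{\varepsilon^2}{8\varphi(C)}. \]

However, as $\lim_{i \to \infty} H_\varphi(\mu_i) = H_\varphi(\mu)$ by assumption, this means that $\bar{\mu}_i := \{(\rho + \rho_i)/2\}\omega$
satisfies

\[ \limsup_{i \to \infty} H_\varphi(\bar{\mu}_i) \le H_\varphi(\mu) - \frac{\varepsilon^2}{8\varphi(C)}. \]

This contradicts the lower semi-continuity of $H_\varphi$ (Lemma 5.6) and we complete the proof
of Claim 8.12.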
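Observe that

\[ h_\varphi(r) - r h'_\varphi(r) = \int_0^r \{ \ln_\varphi(s) - \ln_\varphi(r) \} \, ds = -\int_0^r \int_s^r \frac{1}{\varphi(t)} \, dt \, ds = -\int_0^r \frac{t}{\varphi(t)} \, dt. \]

Combining this with Lemma 2.9, we have for any $r, s > 0$

\[ \big| h_\varphi(r) - r h'_\varphi(r) - \{ h_\varphi(s) - s h'_\varphi(s) \} \big| = \left| \int_s^r \frac{t}{\varphi(t)} \, dt \right| \le C_\varphi \left| \int_s^r t^{1-\theta_\varphi} \, dt \right| = \frac{C_\varphi}{2 - \theta_\varphi} |r^m - s^m|, \]

where we set $m = 2 - \theta_\varphi > 0$. Thus we deduce that

\[ \int_M \big| h_\varphi(\rho_i) - \rho_i h'_\varphi(\rho_i) - \{ h_\varphi(\rho) - \rho h'_\varphi(\rho) \} \big| \, d\omega \le \frac{C_\varphi}{2 - \theta_\varphi} \int_M |\rho_i^m - \rho^m| \, d\omega. \tag{8.10} \]

We are done if the right hand side tends to zero as $i \to \infty$.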



Claim 8.13 For $m = 2 - \theta_\varphi \in (0,2)$, we have

\[ \rho, \rho_i \in L^m(M,\omega), \qquad \lim_{i \to \infty} \| \rho_i - \rho \|_{L^m(M,\omega)} = 0. \]

Proof. The first assertion is clear when $m \le 1$. For $m > 1$, it is a consequence of
$h_\varphi(\rho), h_\varphi(\rho_i) \in L^1(M,\omega)$ (guaranteed by $H_\varphi(\mu), H_\varphi(\mu_i) < \infty$). Indeed, by Lemma 2.9
and (2.9), we have on $\{x \in M \mid \rho(x) > C\}$ for any $C > 0$



u^{p) - u^{C) 



hi^{s) ds > 



c 



(s) ds 



c 



p™ - - m(p - C) 
m{m — 1) 



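which implies $\max\{\rho,C\} \in L^m(M,\omega)$ since $m - 1 > 0$, $u_\varphi(\rho) - u_\varphi(C) \ge 0$ and $u_\varphi(\rho) \in
L^1(M,\omega)$. Thus we obtain $\rho \in L^m(M,\omega)$ and $\rho_i \in L^m(M,\omega)$ similarly. We remark that,
as $\lim_{i \to \infty} H_\varphi(\mu_i) = H_\varphi(\mu)$ by assumption, we have $\lim_{i \to \infty} \int_M u_\varphi(\rho_i) \, d\omega = \int_M u_\varphi(\rho) \, d\omega$
so that $\int_M \rho_i^m \, d\omega$ is uniformly bounded in $i$.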

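As for the second estimate, thanks to Claim 8.12 and $m < 2$, it suffices to show that
$\rho_i - \min\{\rho_i,C\}$ converges to $\rho - \min\{\rho,C\}$ in $L^m(M,\omega)$ for some (arbitrarily fixed) $C > 0$.
Note first that

\[ |(\rho_i - \min\{\rho_i,C\}) - (\rho - \min\{\rho,C\})| = |\max\{\rho_i,C\} - \max\{\rho,C\}|. \]

We put $\rho_i^C := \max\{\rho_i,C\}$ and $\rho^C := \max\{\rho,C\}$ for brevity. By the same argumentation
as Claim 8.12, $\lim_{i \to \infty} H_\varphi(\mu_i) = H_\varphi(\mu)$ yields

\[ \lim_{i \to \infty} \int_M \frac{|\rho_i - \rho|^2}{\max\{\varphi(\rho_i),\varphi(\rho)\}} \, d\omega = 0. \]

Since $\varphi$ is positive and non-decreasing, it holds

\[ \int_M \frac{|\rho_i - \rho|^2}{\max\{\varphi(\rho_i),\varphi(\rho)\}} \, d\omega \ge \int_M \frac{|\rho_i^C - \rho^C|^2}{\varphi(\rho_i^C) + \varphi(\rho^C)} \, d\omega. \]

It follows from the Hölder inequality that

\[ \int_M |\rho_i^C - \rho^C|^m \, d\omega \le \left( \int_M \frac{|\rho_i^C - \rho^C|^2}{\varphi(\rho_i^C) + \varphi(\rho^C)} \, d\omega \right)^{m/2} \left( \int_M \{ \varphi(\rho_i^C) + \varphi(\rho^C) \}^{m/(2-m)} \, d\omega \right)^{(2-m)/2}. \]

Observe that $m/(2-m) = m/\theta_\varphi$. We deduce from Lemma 2.9 that

\[ \{ \varphi(\rho_i^C) + \varphi(\rho^C) \}^{m/\theta_\varphi} \le \begin{cases} \varphi(\rho_i^C)^{m/\theta_\varphi} + \varphi(\rho^C)^{m/\theta_\varphi} & \text{for } m \le 1, \\ 2^{m/\theta_\varphi - 1} \{ \varphi(\rho_i^C)^{m/\theta_\varphi} + \varphi(\rho^C)^{m/\theta_\varphi} \} & \text{for } m > 1. \end{cases} \]

Since $\int_M (\rho_i^C)^m \, d\omega$ is uniformly bounded in $i$, we find

\[ \limsup_{i \to \infty} \int_M \{ \varphi(\rho_i^C) + \varphi(\rho^C) \}^{m/\theta_\varphi} \, d\omega < \infty, \]

and hence $\lim_{i \to \infty} \| \rho_i^C - \rho^C \|_{L^m(M,\omega)} = 0$.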
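Now we obtain, for $m \le 1$,

\[ \int_M |\rho_i^m - \rho^m| \, d\omega \le \int_M |\rho_i - \rho|^m \, d\omega \to 0 \quad (i \to \infty) \]

with the help of Claim 8.13. Similarly, it holds for $m > 1$ that

\[ \int_M |\rho_i^m - \rho^m| \, d\omega \le m \int_M |\rho_i - \rho| \max\{\rho_i,\rho\}^{m-1} \, d\omega \]
\[ \le m \left( \int_M |\rho_i - \rho|^m \, d\omega \right)^{1/m} \left( \int_M (\rho_i + \rho)^m \, d\omega \right)^{(m-1)/m} \to 0 \quad (i \to \infty). \]

□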



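We remark that, in Lemma 8.11 and hence in Theorem 8.7, the assumptions $\theta_\varphi \in (0,2)$
and $C_\varphi < \infty$ can be replaced with

\[ \delta_\varphi \in (0,2), \qquad c_\varphi := \lim_{s \downarrow 0} \frac{s^{\delta_\varphi}}{\varphi(s)} < \infty, \qquad d_\varphi := \lim_{s \to \infty} \frac{s^{\delta_\varphi}}{\varphi(s)} > 0. \]

Indeed, then we have

\[ \frac{s^{\delta_\varphi}}{c_\varphi} \le \varphi(s) \le \frac{s^{\delta_\varphi}}{d_\varphi} \]

for all $s > 0$, and (8.10) becomes

\[ \int_M \big| h_\varphi(\rho_i) - \rho_i h'_\varphi(\rho_i) - \{ h_\varphi(\rho) - \rho h'_\varphi(\rho) \} \big| \, d\omega \le \frac{c_\varphi}{2 - \delta_\varphi} \int_M |\rho_i^m - \rho^m| \, d\omega \]

for $m := 2 - \delta_\varphi$. With this $m \in (0,2)$, Claim 8.13 follows from Proposition 2.13 and $\varphi(s) \le
s^{\delta_\varphi}/d_\varphi$ (Claim 8.12 is unnecessary in this case since we can treat $\rho$ and $\rho_i$ themselves
instead of $\rho^C$ and $\rho_i^C$).

Note that $C_\varphi = c_\varphi = d_\varphi = 1 < \infty$ for $\varphi_m(s) = s^{2-m}$. For

\[ \varphi(s) := \begin{cases} \sqrt{s} & \text{for } 0 < s < 1, \\ s & \text{for } s \ge 1, \end{cases} \]

we have $\theta_\varphi = 1$, $\delta_\varphi = 1/2$, $C_\varphi = c_\varphi = 1$ and $d_\varphi = 0$. An example of $\varphi$ with $C_\varphi = \infty$ is

\[ \varphi(s) := \begin{cases} \sqrt{s} & \text{for } 0 < s < 1, \\ s & \text{for } 1 \le s \le 2, \\ \sqrt{2s} & \text{for } s > 2, \end{cases} \]

for which $\theta_\varphi = 1$, $\delta_\varphi = 1/2$, $c_\varphi = 1$ and $d_\varphi = 1/\sqrt{2}$.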
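The exponents quoted for these examples can be sanity-checked numerically. The following is only a sketch under an assumption: it reads $\theta_\varphi$ and $\delta_\varphi$ as the supremum and infimum over $s > 0$ of the logarithmic slope $s\,\varphi'(s+)/\varphi(s)$, which is how the definition from Section 2 is understood here. The grid, the step size and all function names are illustrative choices, not part of the paper.

```python
import math

def phi_piecewise(s):
    """Example from the text: sqrt(s) on (0, 1), s on [1, inf)."""
    return math.sqrt(s) if s < 1.0 else s

def phi_power(s, m=1.5):
    """phi_m(s) = s**(2 - m); m = 1.5 is an arbitrary illustrative value."""
    return s ** (2.0 - m)

def log_slope(f, s, eps=1e-7):
    """Right difference quotient approximating s * f'(s+) / f(s)."""
    return s * (f(s + eps) - f(s)) / (eps * f(s))

def slope_range(f):
    """Min and max of the logarithmic slope on a log-spaced grid 1e-3 .. 1e3."""
    grid = [10.0 ** (k / 10.0) for k in range(-30, 31)]
    slopes = [log_slope(f, s) for s in grid]
    return min(slopes), max(slopes)

# Assumed reading: the min estimates delta_phi, the max estimates theta_phi.
delta_est, theta_est = slope_range(phi_piecewise)
power_lo, power_hi = slope_range(phi_power)   # both should be close to 2 - m
```

On this grid the piecewise example gives estimates close to $1/2$ and $1$, and the power function $\varphi_m$ gives a constant slope close to $2 - m$, in line with the values stated above.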



9 Gradient flow of $H_\varphi$: Noncompact case

We continue the study of gradient flows in the Wasserstein space $(\mathcal{P}^2(M), W_2)$. For noncompact
$M$, we cannot follow the intrinsic argument in Subsection 8.1 since Theorem 8.1
is unavailable. We can nevertheless introduce a Riemannian structure of $\mathcal{P}^2(M)$ using
the underlying Riemannian structure of $M$. Then gradient flows in $\mathcal{P}^2(M)$ are also formulated
with the help of the underlying Riemannian/differentiable structure of $M$. In
order to see that the analogue of Theorem 8.7 holds true, we follow the argumentation
in [AGS1], [Er] and [Vi2, Chapter 23]. We refer to [AGS1] for the further deep theory of
gradient flows.



9.1 Riemannian structure of $(\mathcal{P}^2(M), W_2)$

Recall that minimal geodesics in $\mathcal{P}^2(M)$ emanating from absolutely continuous measures
are described by the gradient vector fields of appropriate functions (Theorem 2.6). This
leads to the following definitions due to Otto [Ot] of the tangent spaces and the Riemannian
structure.

Definition 9.1 (Otto's Riemannian structure) We set

\[ \widehat{T}\mathcal{P} := \{ \Phi = \nabla\phi \mid \phi \in C_c^\infty(M) \} \]

and define the tangent space $(T_\mu\mathcal{P}^2, \langle \cdot,\cdot \rangle_\mu)$ of $\mathcal{P}^2(M)$ at $\mu \in \mathcal{P}^2(M)$ as the completion of
$\widehat{T}\mathcal{P}$ with respect to the norm $\| \cdot \|_\mu$ induced from the inner product

\[ \langle \Phi_1,\Phi_2 \rangle_\mu := \int_M \langle \Phi_1,\Phi_2 \rangle \, d\mu, \qquad \Phi_1,\Phi_2 \in \widehat{T}\mathcal{P}. \]

Note that $\langle \cdot,\cdot \rangle_\mu$ is extended to the whole space $T_\mu\mathcal{P}^2$ as the limit, and $(T_\mu\mathcal{P}^2, \langle \cdot,\cdot \rangle_\mu)$ is
a Hilbert space. We next introduce the class of 'differentiable curves' in a purely metric
way (cf. [AGS1, Section 1.1]).

Definition 9.2 (Absolutely continuous curves) For $p \in [1,\infty]$, a curve $(\mu_t)_{t \in I} \subset
\mathcal{P}^2(M)$ on an open interval $I \subset \mathbb{R}$ is said to be $p$-absolutely continuous if there is some
$\eta \in L^p_{loc}(I)$ such that

\[ W_2(\mu_s,\mu_t) \le \int_s^t \eta(r) \, dr \tag{9.1} \]

holds for all $s,t \in I$ with $s < t$.

Note that $p$-absolutely continuous curves are continuous. We will consider only 2-absolutely
continuous curves, so that we simply call them absolutely continuous curves.
For any absolutely continuous curve $(\mu_t)_{t \in I} \subset \mathcal{P}^2(M)$, the metric derivative

\[ |\dot{\mu}_t| := \lim_{s \to t} \frac{W_2(\mu_s,\mu_t)}{|t - s|} \]

exists for a.e. $t \in I$, and $\eta(t) = |\dot{\mu}_t|$ is a minimal function satisfying (9.1) (cf. [AGS1,
Theorem 1.1.2]). We can associate a one-parameter family of vector fields on $M$ with an
absolutely continuous curve in $\mathcal{P}^2(M)$ via the continuity equation on $M$.

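Proposition 9.3 ([AGS1, Theorem 8.3], [Er, Proposition 2.5]) Given an absolutely continuous
curve $(\mu_t)_{t \in I} \subset \mathcal{P}^2(M)$, there exists a Borel vector field $\Phi : I \times M \to TM$ (with
$\Phi_t(x) := \Phi(t,x) \in T_xM$) satisfying $\Phi_t \in T_{\mu_t}\mathcal{P}^2$ for a.e. $t \in I$ as well as the continuity
equation

\[ \frac{\partial \mu_t}{\partial t} + \operatorname{div}(\Phi_t \mu_t) = 0 \]

in the weak sense that

\[ \int_I \int_M \left\{ \frac{\partial w_t}{\partial t} + \langle \Phi_t, \nabla w_t \rangle \right\} d\mu_t \, dt = 0 \tag{9.2} \]

holds for all $w \in C_c^\infty(I \times M)$. Such a vector field $\Phi$ (satisfying $\Phi_t \in T_{\mu_t}\mathcal{P}^2$ and (9.2)) is
uniquely determined up to a difference on a null measure set with respect to $d\mu_t \, dt$, and
we have $\| \Phi_t \|_{\mu_t} = |\dot{\mu}_t|$ for a.e. $t \in I$.

Conversely, if a curve $(\mu_t)_{t \in I} \subset \mathcal{P}^2(M)$ admits a Borel vector field $\Phi : I \times M \to TM$
satisfying (9.2) and $\int_{t_0}^{t_1} \| \Phi_t \|_{\mu_t}^2 \, dt < \infty$ for all $t_0,t_1 \in I$ with $t_0 < t_1$, then $(\mu_t)_{t \in I}$ is
absolutely continuous and $|\dot{\mu}_t| \le \| \Phi_t \|_{\mu_t}$ for a.e. $t \in I$.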

Definition 9.4 (Tangent vector fields) We say that the vector field $\Phi$ as in Proposition 9.3
is the tangent vector field of the absolutely continuous curve $(\mu_t)_{t \in I}$, and write
$\dot{\mu}_t = \Phi_t$ (for a.e. $t \in I$).

It is guaranteed by the following Benamou-Brenier formula ([BB]) that Otto's Riemannian
structure is compatible with the $W_2$-structure,

\[ W_2(\mu_0,\mu_1) = \inf_{(\mu_t)_{t \in [0,1]}} \left( \int_0^1 \| \dot{\mu}_t \|_{\mu_t}^2 \, dt \right)^{1/2} \]

for any $\mu_0,\mu_1 \in \mathcal{P}^2(M)$, where the infimum is taken over all absolutely continuous curves
$(\mu_t)_{t \in [0,1]} \subset \mathcal{P}^2(M)$ from $\mu_0$ to $\mu_1$.

9.2 Gradient flow of $H_\varphi$

Using the Riemannian structure of $\mathcal{P}^2(M)$ in the previous subsection, we can formulate
gradient curves (trajectories of gradient flow) in a way different from the previous section.
We first define gradient vectors.

Definition 9.5 (Gradient vectors) Given a functional $H : \mathcal{P}^2(M) \to (-\infty,\infty]$ and
$\mu \in \mathcal{P}^2(M)$ with $H(\mu) < \infty$, we say that $H$ is differentiable at $\mu$ if there is $\Phi \in T_\mu\mathcal{P}^2$
such that

\[ \limsup_{t \downarrow 0} \frac{H(\mu_t) - H(\mu)}{t} \le \int_M \langle \Phi, \nabla\phi \rangle \, d\mu \]

along all minimal geodesics $(\mu_t)_{t \in [0,1]} \subset \mathcal{P}^2(M)$ with $\mu_0 = \mu$, where $\mu_t = (\mathcal{T}_t)_\sharp \mu$ with
$\mathcal{T}_t(x) = \exp_x(t\nabla\phi(x))$, and if equality holds for $\phi \in C_c^\infty(M)$ (with $\lim_{t \downarrow 0}$ in place of
$\limsup_{t \downarrow 0}$). Such $\Phi$ is unique if it exists, so that we will write $\nabla_W H(\mu) = \Phi$.

Note that $|\nabla_-(-H)|(\mu) \le \| \nabla_W H(\mu) \|_\mu$ holds by the Cauchy-Schwarz inequality. A
gradient curve of the $\varphi$-relative entropy should be understood as a solution to $\dot{\mu}_t =
\nabla_W[-H_\varphi](\mu_t)$. Compare the next proposition with Proposition 8.6.

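Proposition 9.6 Let $(M,\omega,\varphi,\Psi)$ be admissible, assume $\mathrm{Ric}_{N_\varphi} \ge 0$ and $\operatorname{Hess} \Psi \ge K$
on $M^*$ for some $K \in \mathbb{R}$ ($K \ge 0$ if $M$ is noncompact and $\theta_\varphi < 1$). Fix $\mu = \rho\omega \in
\mathcal{P}^2_{ac}(M,\omega)$ with $\mu[M^*] = 1$, $H_\varphi(\mu) < \infty$ and with $|\nabla\Psi| \in L^2(M,\mu)$. Then the following
are equivalent:

(I) $|\nabla_- H_\varphi|(\mu) < \infty$,
(II) $\rho \in H^1_{loc}(M)$ and

\[ \frac{\nabla\rho}{\varphi(\rho)} + \nabla\Psi = -\Phi \]

holds $\mu$-a.e. for some $\Phi \in T_\mu\mathcal{P}^2$.
Moreover, then we have $\Phi = \nabla_W[-H_\varphi](\mu)$ and $\| \Phi \|_\mu = |\nabla_- H_\varphi|(\mu)$.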

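Proof. (I) $\Rightarrow$ (II): Note that, by the calculation (before the integration by parts) in the
proof of Proposition 6.1,

\[ \int_M \big[ \{ h'_\varphi(\rho)\rho - h_\varphi(\rho) \} \operatorname{div}_\omega V - \langle \rho\nabla\Psi, V \rangle \big] \, d\omega \le \limsup_{t \downarrow 0} \left\{ \frac{H_\varphi(\mu) - H_\varphi(\mu_t)}{W_2(\mu,\mu_t)} \cdot \frac{W_2(\mu,\mu_t)}{t} \right\} \]

for all vector fields $V$ of compact support, where we put $\mu_t = (\mathcal{T}_t)_\sharp \mu$ with $\mathcal{T}_t(x) =
\exp_x(tV(x))$. Hence the hypothesis (I) together with $\Psi \in H^1_{loc}(M)$ ensures that the
function $h'_\varphi(\rho)\rho - h_\varphi(\rho)$ is weakly differentiable. Since the function $s \mapsto h'_\varphi(s)s - h_\varphi(s)$
is differentiable and increasing in $s > 0$, this implies $\rho \in H^1_{loc}(M)$, and we observe

\[ \nabla[ h'_\varphi(\rho)\rho - h_\varphi(\rho) ] = \rho \, h''_\varphi(\rho) \nabla\rho = \frac{\rho \nabla\rho}{\varphi(\rho)}. \]

Moreover, the above estimate shows that the function

\[ \widehat{T}\mathcal{P} \ni \nabla\phi \mapsto \int_M \big\langle \nabla[ h'_\varphi(\rho)\rho - h_\varphi(\rho) ] + \rho\nabla\Psi, \nabla\phi \big\rangle \, d\omega = \int_M \left\langle \frac{\nabla\rho}{\varphi(\rho)} + \nabla\Psi, \nabla\phi \right\rangle d\mu \]

is extended to a bounded linear operator on the closure $T_\mu\mathcal{P}^2$. Therefore the Riesz representation
theorem shows that there exists $\Phi \in T_\mu\mathcal{P}^2$ with

\[ \int_M \left\langle \frac{\nabla\rho}{\varphi(\rho)} + \nabla\Psi, \Xi \right\rangle d\mu = -\langle \Phi, \Xi \rangle_\mu \tag{9.3} \]

for all $\Xi \in T_\mu\mathcal{P}^2$. Thus we have $\nabla\rho/\varphi(\rho) + \nabla\Psi = -\Phi$ $\mu$-a.e.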

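(II) $\Rightarrow$ (I): We remark that the condition $K \ge 0$ for $\theta_\varphi < 1$ makes Proposition 6.1
applicable. Thus we obtain

\[ \limsup_{t \downarrow 0} \frac{H_\varphi(\mu) - H_\varphi(\mu_t)}{t} \le \int_M \langle \Phi, \nabla\phi \rangle \, d\mu \tag{9.4} \]

along every minimal geodesic $(\mu_t)_{t \in [0,1]} \subset \mathcal{P}^2(M)$ with $\mu_0 = \mu$, where $\mu_t = (\mathcal{T}_t)_\sharp \mu$ and
$\mathcal{T}_t(x) = \exp_x(t\nabla\phi(x))$, and equality holds if $\phi \in C_c^\infty(M)$. Hence $|\nabla_- H_\varphi|(\mu) < \infty$ follows
from the hypothesis $\Phi \in T_\mu\mathcal{P}^2$, and we find $\Phi = \nabla_W[-H_\varphi](\mu)$ in the sense of Definition 9.5.
We have $\| \Phi \|_\mu \le |\nabla_- H_\varphi|(\mu)$ by (9.3) and $|\nabla_- H_\varphi|(\mu) \le \| \Phi \|_\mu$ by (9.4), so that
$\| \Phi \|_\mu = |\nabla_- H_\varphi|(\mu)$ holds. □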

Now, we are ready to show the main result of the section. We remark that the roles
of the conditions $\mathrm{Ric}_{N_\varphi} \ge 0$ and $\operatorname{Hess} \Psi \ge K$ are implicit at this stage, whereas they were
necessary for applying Proposition 6.1.

Theorem 9.7 (Gradient flow of $H_\varphi$) Suppose that $(M,\omega,\varphi,\Psi)$ is admissible and satisfies
$\mathrm{Ric}_{N_\varphi} \ge 0$ as well as $\operatorname{Hess} \Psi \ge K$ on $M^*$ for some $K \in \mathbb{R}$ ($K \ge 0$ if $M$ is
noncompact and $\theta_\varphi < 1$). Let $(\mu_t)_{t \in [0,\infty)} \subset \mathcal{P}^2_{ac}(M,\omega)$ be a continuous curve such that
$\mu_t[M^*] = 1$, $H_\varphi(\mu_t) < \infty$ and $|\nabla\Psi| \in L^2(M,\mu_t)$ for all $t > 0$. Then $(\mu_t)_{t \in (0,\infty)}$ is an
absolutely continuous curve satisfying

\[ \dot{\mu}_t = \nabla_W[-H_\varphi](\mu_t) \in T_{\mu_t}\mathcal{P}^2 \]

at a.e. $t \in (0,\infty)$ if and only if $(\rho_t)_{t \in [0,\infty)}$ is a weak solution to the $\varphi$-heat equation (8.2)
with $\int_{t_0}^{t_1} \int_M |\nabla\rho_t/\varphi(\rho_t)|^2 \, d\mu_t \, dt < \infty$ for all $0 < t_0 < t_1 < \infty$, where $\mu_t = \rho_t\omega$.

Proof. Suppose $\dot{\mu}_t = \nabla_W[-H_\varphi](\mu_t)$ a.e. $t$. Since $|\nabla_- H_\varphi|(\mu_t) \le \| \nabla_W[-H_\varphi](\mu_t) \|_{\mu_t} < \infty$
by definition, Proposition 9.6 yields

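\[ \dot{\mu}_t = -\left( \frac{\nabla\rho_t}{\varphi(\rho_t)} + \nabla\Psi \right). \]

Then it follows from the continuity equation (9.2) that

\[ \int_0^\infty \int_M \frac{\partial w_t}{\partial t} \, d\mu_t \, dt = \int_0^\infty \int_M \left\langle \frac{\nabla\rho_t}{\varphi(\rho_t)} + \nabla\Psi, \nabla w_t \right\rangle d\mu_t \, dt \]

for all $w \in C_c^\infty((0,\infty) \times M)$. Therefore $\rho_t$ weakly solves (8.2).

Conversely, if $\rho_t$ is a weak solution to (8.2) with $\int_{t_0}^{t_1} \int_M |\nabla\rho_t/\varphi(\rho_t)|^2 \, d\mu_t \, dt < \infty$, then the
same calculation implies that

\[ \Phi_t := -\left( \frac{\nabla\rho_t}{\varphi(\rho_t)} + \nabla\Psi \right) \]

satisfies the continuity equation (9.2), and hence $(\mu_t)_{t \in (0,\infty)}$ is absolutely continuous by
Proposition 9.3. As Proposition 6.1 guarantees $|\nabla_- H_\varphi|(\mu_t) \le \| \Phi_t \|_{\mu_t} < \infty$ a.e. $t$ (by (9.4)),
Proposition 9.6 shows $\Phi_t = \nabla_W[-H_\varphi](\mu_t) \in T_{\mu_t}\mathcal{P}^2$ and then the uniqueness of a solution
to the continuity equation (Proposition 9.3) yields $\dot{\mu}_t = \Phi_t = \nabla_W[-H_\varphi](\mu_t)$ a.e. $t$. □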



9.3 Remarks on construction and contraction 

We can construct the gradient flow of $H_\varphi$ along the line of [Er, Section 5], provided that
$M$ is compact. Precisely, we need the compactness for applying Lemma 5.6, assertion (i) in the
proof of Theorem 8.7, and Lemma 8.11. As for the contractivity (see (8.1)), the usual technique starts from the
first variation formula for the distance $W_2(\mu_t,\tilde{\mu}_t)$ between two gradient curves (see, e.g.,
[Er, Proposition 4.4]). To follow this line, however, we need (at least) the $C^1$-regularity
of the density functions $\rho_t$. The authors do not know if such a regularity can be expected
for our (nonlinear, scale-variant) $\varphi$-heat equation (8.2).

Another recipe (for construction as well as contraction) would be to apply the general
theory of Savaré [Sa]. Under $\operatorname{Hess} H_\varphi \ge K$ and the additional semi-concavity condition of
the squared distance function (which is always true for compact Riemannian manifolds),
one can construct a unique gradient flow of $H_\varphi$ enjoying the $K$-contractivity (8.1). However,
we should take care about the point that his (metric) definition of gradient flows is
different from the one discussed in Theorem 9.7. Thus, in particular, the existence of a
gradient flow in our sense does not follow from Savaré's result.

We also mention another interesting contribution due to Gigli [Gi1]: he showed the
unique existence of the gradient flow of the relative entropy in a quite general situation
without relying on the contractivity. As mentioned at the end of [Gi1], his technique uses
some special properties of the generating function $u_{\varphi_1}(s) = s \log s - s$ and is not applicable
to all $\varphi$'s in our consideration (e.g., $\varphi_m$ for $m < 1$ is excluded).

10 Finsler case

Most results in this article are extended to Finsler manifolds according to the theory of
Ricci curvature developed in [Oh2], [OS1] (see also a survey [Oh3]). A Finsler manifold
is a differentiable manifold equipped with a (Minkowski) norm on each tangent space.
Restricting these norms to those coming from inner products, we have the family of
Riemannian manifolds as a subclass. We refer to [BCS] and [Sh] for the basics of Finsler
geometry.



10.1 Finsler manifolds 

Let $M$ be a connected, $n$-dimensional $C^\infty$-manifold without boundary. Given a local
coordinate $(x^i)_{i=1}^n$ on an open set $U \subset M$, we will always use the coordinate $(x^i,v^j)_{i,j=1}^n$
of $TU$ such that

\[ v = \sum_{j=1}^n v^j \frac{\partial}{\partial x^j}\Big|_x \in T_xM, \qquad x \in U. \]

Definition 10.1 (Finsler structures) We say that a nonnegative function $F : TM \to
[0,\infty)$ is a $C^\infty$-Finsler structure of $M$ if the following three conditions hold:

(1) (Regularity) $F$ is $C^\infty$ on $TM \setminus 0$, where $0 \subset TM$ stands for the zero section.

(2) (Positive 1-homogeneity) It holds $F(cv) = cF(v)$ for all $v \in TM$ and $c > 0$.

(3) (Strong convexity) The $n \times n$ symmetric matrix

\[ \big( g_{ij}(v) \big)_{i,j=1}^n := \left( \frac{1}{2} \frac{\partial^2 (F^2)}{\partial v^i \partial v^j}(v) \right)_{i,j=1}^n \tag{10.1} \]

is positive-definite for all $v \in T_xM \setminus 0$.
We call such a pair $(M,F)$ a $C^\infty$-Finsler manifold.

In other words, $F$ provides a $C^\infty$-Minkowski norm (see Example 10.2(a) below) on each
tangent space $T_xM$ which varies smoothly also in the horizontal direction. For $x,y \in M$,
we define the distance from $x$ to $y$ in a natural way by $d_F(x,y) := \inf_\gamma \int_0^1 F(\dot\gamma(t)) \, dt$,
where the infimum is taken over all $C^1$-curves $\gamma : [0,1] \to M$ such that $\gamma(0) = x$ and
$\gamma(1) = y$. Note that $d_F$ is not necessarily symmetric, namely $d_F(y,x) \neq d_F(x,y)$ can
happen, since $F$ is only positively homogeneous. A $C^\infty$-curve $\gamma$ on $M$ is called a geodesic
if it is locally distance minimizing and has a constant speed (i.e., $F(\dot\gamma)$ is constant). We
remark that $t \mapsto \gamma(1-t)$ may not be a geodesic. Given $v \in T_xM$, if there is a geodesic
$\gamma : [0,1] \to M$ with $\dot\gamma(0) = v$, then we define the exponential map by $\exp_x(v) := \gamma(1)$.
We say that $(M,F)$ is forward complete if the exponential map is defined on whole $TM$.
Then the Hopf-Rinow theorem ensures that any pair of points is connected by a minimal
geodesic (cf. [BCS, Theorem 6.6.1]).






We define the $K$-convexity of a function $\Psi : M \to \mathbb{R}$ in the weak sense similarly to
the case of symmetric distances (Definition 4.1), i.e., for any $x,y \in M$ there is a minimal
geodesic $\gamma : [0,1] \to M$ from $x$ to $y$ such that

\[ \Psi(\gamma(t)) \le (1-t)\Psi(x) + t\Psi(y) - \frac{K}{2}(1-t)t \, d_F(x,y)^2 \]

for all $t \in [0,1]$.

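For each $v \in T_xM \setminus 0$, the positive-definite matrix $(g_{ij}(v))_{i,j=1}^n$ in (10.1) induces the
Riemannian structure $g_v$ of $T_xM$ via

\[ g_v\!\left( \sum_{i=1}^n a_i \frac{\partial}{\partial x^i}\Big|_x, \sum_{j=1}^n b_j \frac{\partial}{\partial x^j}\Big|_x \right) := \sum_{i,j=1}^n g_{ij}(v) a_i b_j. \tag{10.2} \]

This is regarded as the best Riemannian approximation of $F|_{T_xM}$ in the direction $v$. In
fact, the unit sphere of $g_v$ is tangent to that of $F|_{T_xM}$ at $v/F(v)$ up to the second order.
In particular, we have $g_v(v,v) = F(v)^2$.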
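The identity $g_v(v,v) = F(v)^2$ and the positive-definiteness of $(g_{ij}(v))$ can be checked numerically on a single tangent space. The sketch below is illustrative only: it assumes a Randers-type Minkowski norm on $\mathbb{R}^2$ (Euclidean length plus a constant one-form `BETA`, both chosen here for the example) and approximates the second derivatives of $F^2$ by central differences with an ad hoc step size.

```python
import math

BETA = (0.5, 0.0)   # illustrative one-form with Euclidean length < 1

def F(v):
    """Randers-type Minkowski norm on R^2: Euclidean length plus BETA(v)."""
    return math.hypot(v[0], v[1]) + BETA[0] * v[0] + BETA[1] * v[1]

def fundamental_tensor(v, h=1e-4):
    """g_ij(v) = (1/2) d^2(F^2)/dv^i dv^j, approximated by central differences."""
    x, y = v
    f2 = lambda a, b: F((a, b)) ** 2
    g11 = (f2(x + h, y) - 2.0 * f2(x, y) + f2(x - h, y)) / (2.0 * h * h)
    g22 = (f2(x, y + h) - 2.0 * f2(x, y) + f2(x, y - h)) / (2.0 * h * h)
    g12 = (f2(x + h, y + h) - f2(x + h, y - h)
           - f2(x - h, y + h) + f2(x - h, y - h)) / (8.0 * h * h)
    return ((g11, g12), (g12, g22))

def g_v(v, a, b):
    """Inner product g_v(a, b) = sum_ij g_ij(v) a_i b_j."""
    g = fundamental_tensor(v)
    return sum(g[i][j] * a[i] * b[j] for i in range(2) for j in range(2))

v = (1.0, 2.0)
g = fundamental_tensor(v)
det = g[0][0] * g[1][1] - g[0][1] ** 2   # positive for a positive-definite 2x2 matrix
```

Up to finite-difference error, `g_v(v, v, v)` agrees with `F(v) ** 2`, which is Euler's theorem applied to the 2-homogeneous function $F^2$.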

Let us denote by $\mathcal{L}^* : T^*M \to TM$ the Legendre transform. Precisely, $\mathcal{L}^*$ is sending
$\alpha \in T_x^*M$ to the unique element $v \in T_xM$ such that $\alpha(v) = F^*(\alpha)^2$ and $F(v) = F^*(\alpha)$,
where $F^*$ stands for the dual norm of $F$. Note that $\mathcal{L}^*|_{T_x^*M}$ is a linear operator only
when $F|_{T_xM}$ comes from an inner product. For a differentiable function $\rho : M \to \mathbb{R}$, the
gradient vector of $\rho$ at $x$ is defined as the Legendre transform of the derivative of $\rho$,

\[ \nabla\rho(x) := \mathcal{L}^*(D\rho(x)) \in T_xM. \]

If $D\rho(x) = 0$, then clearly $\nabla\rho(x) = 0$. If $D\rho(x) \neq 0$, then we can write in coordinates

\[ \nabla\rho = \sum_{i,j=1}^n g^{ij}(\nabla\rho) \frac{\partial\rho}{\partial x^j} \frac{\partial}{\partial x^i}, \]

where $(g^{ij})$ stands for the inverse matrix of $(g_{ij})$. We must be careful when $D\rho(x) =
0$, because $g_{ij}(\nabla\rho(x))$ is not defined as well as the Legendre transform $\mathcal{L}^*$ being only
continuous at the zero section. We also remark that the gradient $\nabla$ is a nonlinear operator
(i.e., $\nabla(\rho_1 + \rho_2)(x) \neq \nabla\rho_1(x) + \nabla\rho_2(x)$ and $\nabla(-\rho)(x) \neq -\nabla\rho(x)$ in general), since the
Legendre transform is nonlinear unless $F$ happens to be Riemannian.

We mention some basic examples of non-Riemannian Finsler manifolds.

Example 10.2 (a) (Minkowski spaces) A Minkowski norm $| \cdot |$ on $\mathbb{R}^n$ is a nonnegative
function on $\mathbb{R}^n$ satisfying the conditions in Definition 10.1. Note that the unit ball of
$| \cdot |$ is a strictly convex (but not necessarily symmetric to the origin) domain containing
the origin in its interior. A Minkowski norm induces a Finsler structure in a natural way
through the identification between $T_x\mathbb{R}^n$ and $\mathbb{R}^n$. Then $(\mathbb{R}^n, | \cdot |)$ has the flat flag curvature
(the flag curvature is a generalization of the sectional curvature).

(b) (Randers spaces) A Randers space $(M,F)$ is a special kind of Finsler manifold
given by $F(v) = \sqrt{g(v,v)} + \beta(v)$ for some Riemannian metric $g$ and a one-form $\beta$, where
we suppose $|\beta(v)|^2 < g(v,v)$ unless $v = 0$, for $F$ being positive on $TM \setminus 0$. Randers
spaces are important in applications and reasonable for concrete calculations. Sometimes
$\beta$ is regarded as the effect of wind blowing on the Riemannian manifold $(M,g)$.

(c) (Hilbert geometry) Let $D \subset \mathbb{R}^n$ be a bounded open set with smooth boundary
such that its closure $\overline{D}$ is strictly convex. Then the associated Hilbert distance function
is defined by

\[ d(x_1,x_2) := \log\left( \frac{|x_1' - x_2| \cdot |x_1 - x_2'|}{|x_1' - x_1| \cdot |x_2 - x_2'|} \right) \]

for distinct $x_1,x_2 \in D$, where $| \cdot |$ is the standard Euclidean norm and $x_1',x_2'$ are intersections
of $\partial D$ and the line passing through $x_1,x_2$ such that $x_i'$ is on the side of $x_i$.
Hilbert geometry is known to be realized by a Finsler structure with constant negative
flag curvature, and gives the Klein model of hyperbolic space if $D$ is an ellipsoid.

(d) (Teichmüller space) The Teichmüller metric on Teichmüller space is arguably one of
the most famous Finsler structures in differential geometry. It is known to be complete,
while, e.g., the Weil-Petersson metric is incomplete and Riemannian.
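To make the asymmetry of a Randers norm from item (b) concrete, here is a minimal sketch on $\mathbb{R}^2$. It assumes the Euclidean metric for $g$ and a constant one-form `beta`; these choices and the function names are hypothetical and serve only as an illustration of the "wind" interpretation.

```python
import math

def randers_norm(v, beta):
    """F(v) = |v| + beta(v) on R^2, with the Euclidean metric playing the role of g."""
    return math.hypot(v[0], v[1]) + beta[0] * v[0] + beta[1] * v[1]

def reverse_norm(v, beta):
    """The same norm evaluated in the opposite direction: v -> F(-v)."""
    return randers_norm((-v[0], -v[1]), beta)

beta = (0.5, 0.0)                       # 'wind' along the first axis, |beta| < 1
v = (1.0, 0.0)
with_wind = randers_norm(v, beta)       # F(v)
against_wind = reverse_norm(v, beta)    # F(-v)
```

With these choices $F(v) = 1.5$ while $F(-v) = 0.5$, so $F$ is positively homogeneous but not symmetric, which is the source of the non-symmetric distance $d_F$.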

10.2 Weighted Ricci curvature and nonlinear Laplacian 

Different from the Riemannian situation, one cannot choose a unique canonical measure
on a Finsler manifold. There are several constructive measures, such as the Busemann-Hausdorff
measure and the Holmes-Thompson measure, which are canonical in their own
ways (see, e.g., [AT]). Thus we will fix an arbitrary positive $C^\infty$-measure $\omega$ on $M$ as our
base measure, like the theory of weighted Riemannian manifolds.

The Ricci curvature (as the trace of the flag curvature) on a Finsler manifold is defined
by using the Chern connection (there are other connections but the flag and Ricci
curvatures are in fact independent of the choice of connection). Instead of giving a precise
definition in coordinates, here we explain a useful interpretation due to Shen [Sh, §6.2].
Given a unit vector $v \in T_xM \cap F^{-1}(1)$, we extend it to a $C^\infty$-vector field $V$ on a neighborhood
of $x$ in such a way that every integral curve of $V$ is geodesic, and consider the
Riemannian structure $g_V$ induced from (10.2). Then the Ricci curvature $\mathrm{Ric}(v)$ of $v$ with
respect to $F$ coincides with the Ricci curvature of $v$ with respect to $g_V$ (in particular, it
is independent of the choice of $V$).

Inspired by the above interpretation of the Ricci curvature as well as the theory of
weighted Riemannian manifolds, the weighted Ricci curvature for $(M,F,\omega)$ was introduced
in [Oh2] as follows.

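Definition 10.3 (Weighted Ricci curvature) Given a unit vector $v \in T_xM$, let $\gamma :
(-\varepsilon,\varepsilon) \to M$ be the geodesic such that $\dot\gamma(0) = v$. We decompose $\omega$ as $\omega = e^{-f} \mathrm{vol}_{\dot\gamma}$
along $\gamma$, where $\mathrm{vol}_{\dot\gamma}$ is the volume form of $g_{\dot\gamma}$. Define

(1) $\mathrm{Ric}_n(v) := \mathrm{Ric}(v) + (f \circ \gamma)''(0)$ if $(f \circ \gamma)'(0) = 0$, and $\mathrm{Ric}_n(v) := -\infty$ otherwise,

(2) $\mathrm{Ric}_N(v) := \mathrm{Ric}(v) + (f \circ \gamma)''(0) - \dfrac{(f \circ \gamma)'(0)^2}{N - n}$ for $N \in (-\infty,0) \cup (n,\infty)$,

(3) $\mathrm{Ric}_\infty(v) := \mathrm{Ric}(v) + (f \circ \gamma)''(0)$.

For $c \ge 0$, we set $\mathrm{Ric}_N(cv) := c^2 \mathrm{Ric}_N(v)$.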

It is established in [Oh2, Theorem 1.2] that, for $K \in \mathbb{R}$ and $N \in [n,\infty]$, the bound
$\mathrm{Ric}_N(v) \ge KF(v)^2$ is equivalent to the curvature-dimension condition $CD(K,N)$ (note
that $(M,d_F)$ is non-branching and thus Sturm's and Lott-Villani's conditions are equivalent).
This extends the corresponding result on weighted Riemannian manifolds (Theorems 5.1, 5.2).
There are further applications of $\mathrm{Ric}_N$ beyond the curvature-dimension
condition, e.g., a Bochner-type formula and gradient estimates ([OS3]).

Remark 10.4 For a Riemannian manifold $(M,\mathrm{vol}_g)$ endowed with the Riemannian
volume measure, clearly we have $f \equiv 0$ and hence $\mathrm{Ric}_N = \mathrm{Ric}$ for all $N$. It is also known
that, for Finsler manifolds of Berwald type, the Busemann-Hausdorff measure satisfies
$(f \circ \gamma)' \equiv 0$ (in other words, Shen's S-curvature vanishes, see [Sh, §7.3]). In general,
however, there may not exist any measure $\omega$ of vanishing S-curvature, see [Oh4] for such
an example. This means that, on a general Finsler manifold, there is no measure as good
as the Riemannian volume measure. This is a reason why we began with an arbitrary
measure $\omega$.

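Define the divergence of a differentiable vector field $V$ on $M$ with respect to the base
measure $\omega$ by

\[ \operatorname{div}_\omega V := \sum_{i=1}^n \left( \frac{\partial V^i}{\partial x^i} + V^i \frac{\partial \vartheta}{\partial x^i} \right), \]

where we decompose $\omega$ in coordinates as $d\omega = e^{\vartheta} \, dx^1 dx^2 \cdots dx^n$. Similarly to the Riemannian
case, this can be rewritten (and extended to weakly differentiable vector fields)
in the weak form as

\[ \int_M w \operatorname{div}_\omega V \, d\omega = -\int_M Dw(V) \, d\omega \]

for all $w \in C_c^\infty(M)$. Then we define the corresponding Laplacian of $\rho \in H^1_{loc}(M)$ by
$\Delta^\omega \rho := \operatorname{div}_\omega(\nabla\rho)$ in the distributional sense that

\[ \int_M w \, \Delta^\omega \rho \, d\omega := -\int_M Dw(\nabla\rho) \, d\omega \]

for $w \in C_c^\infty(M)$. We remark that $H^1_{loc}(M)$ is defined solely in terms of the differentiable
structure of $M$. It is established in [OS1] and [OS3] that this nonlinear Laplacian works
quite well with the weighted Ricci curvature.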

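For later convenience, we introduce the following notations.

Definition 10.5 (Reverse Finsler structure) Define the reverse Finsler structure $\overleftarrow{F}$
of $F$ by $\overleftarrow{F}(v) := F(-v)$. We will put arrows $\leftarrow$ on those quantities associated with $\overleftarrow{F}$,
for example, $\overleftarrow{d_F}(x,y) = d_F(y,x)$, $\overleftarrow{\nabla}\rho = -\nabla(-\rho)$ and $\overleftarrow{\mathrm{Ric}}_N(v) = \mathrm{Ric}_N(-v)$.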

10.3 Displacement convexity of $H_\varphi$ and applications

From now on, we consider only compact Finsler manifolds for simplicity. We remark that
all compact Finsler manifolds are forward complete.

Let us consider an admissible space $(M,\omega,\varphi,\Psi)$ in the sense of Definition 4.3 similarly
to the Riemannian case. Then the analogue of Theorem 5.7 is demonstrated along the
same line as the Riemannian case (see [Oh2] for details).

We can show the functional inequalities in Theorem 6.3 also in the same way by using
the directional derivative of $H_\varphi$ (see (6.2)) modified into

\[ \liminf_{t \downarrow 0} \frac{H_\varphi(\mu_t) - H_\varphi(\mu)}{t} \ge \int_M \left( \frac{D\rho}{\varphi(\rho)} + D\Psi \right)(\nabla\phi) \, d\mu. \]

Precisely, the relative Fisher information of $\mu = \rho\omega \in \mathcal{P}_{ac}(M,\omega)$ is defined by

Jm Jm \<^(p) / 

and the $\varphi$-global Poincaré inequality means

du<^[ F(w(^)ydu. 



We also remark that $W_2(\mu,\nu)$ in (i) and (ii) of Theorem 6.3 can be replaced with $W_2(\nu,\mu)$
since the curvature bound $\mathrm{Ric}_N \ge K$ for $F$ is equivalent to that for its reverse $\overleftarrow{F}$. The
above $\varphi$-Talagrand inequality shows the concentration of measures as in Section 7, where
the open ball $B(A,r)$ in the definition of the concentration function $\alpha(r)$ is replaced with

\[ B^+(A,r) := \Big\{ y \in M \,\Big|\, \inf_{x \in A} d_F(x,y) < r \Big\} \quad \text{or} \quad B^-(A,r) := \Big\{ y \in M \,\Big|\, \inf_{x \in A} d_F(y,x) < r \Big\}. \]
10.4 Gradient flow of $H_\varphi$

As for the gradient flow of $H_\varphi$, due to the lack of the analogue of Theorem 8.1, the argument
in Section 8 is unavailable. Nonetheless, one can apply the discussion in Section 9
using a (formal) Finsler structure of the Wasserstein space, and obtain a result corresponding
to Theorem 9.7. We remark that, however, the $K$-contraction property (8.1)
essentially depends on the Riemannian structure and cannot be expected in the Finsler
setting (see [OS2] for details).

Let $(M,F)$ be compact again. We introduce a Finsler structure of $(\mathcal{P}(M),W_2)$ similarly
to Section 9. Given $\mu \in \mathcal{P}(M)$, define the tangent space $(T_\mu\mathcal{P},F_\mu)$ at $\mu$ by

\[ F_\mu(\nabla\phi) := \left( \int_M F(\nabla\phi)^2 \, d\mu \right)^{1/2} \ \text{for } \phi \in C^\infty(M), \qquad T_\mu\mathcal{P} := \overline{\{ \nabla\phi \mid \phi \in C^\infty(M) \}}, \]

where the closure was taken with respect to the (Minkowski) norm $F_\mu$. Then we can
follow the line of Section 9 up to some computational differences. We denote by $\mathcal{L} :=
(\mathcal{L}^*)^{-1} : TM \to T^*M$ the Legendre transform in the reverse direction.

Definition 10.6 (Gradient vectors) Given a functional $H : \mathcal{P}(M) \to (-\infty,\infty]$ and
$\mu \in \mathcal{P}(M)$ with $H(\mu) < \infty$, we say that $H$ is differentiable at $\mu$ if there is $\Phi \in T_\mu\mathcal{P}$ such
that

\[ \limsup_{t \downarrow 0} \frac{H(\mu_t) - H(\mu)}{t} \le \int_M \mathcal{L}(\Phi)(\nabla\phi) \, d\mu \]

along all minimal geodesics $(\mu_t)_{t \in [0,1]} \subset \mathcal{P}(M)$ with $\mu_0 = \mu$, where $\mu_t = (\mathcal{T}_t)_\sharp \mu$ and
$\mathcal{T}_t(x) := \exp_x(t\nabla\phi(x))$, and if equality holds for $\phi \in C^\infty(M)$ (with $\lim_{t \downarrow 0}$ in place of
$\limsup_{t \downarrow 0}$). Such $\Phi$ is unique if it exists, and then we write $\nabla_W H(\mu) = \Phi$.

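Proposition 10.7 Let $(M,\omega,\varphi,\Psi)$ be compact, admissible and satisfy $\mathrm{Ric}_{N_\varphi} \ge 0$ and
$\operatorname{Hess} \Psi \ge K$ on $M^*$ for some $K \in \mathbb{R}$, and fix $\mu = \rho\omega \in \mathcal{P}_{ac}(M,\omega)$ with $\mu[M^*] = 1$ and
$H_\varphi(\mu) < \infty$. Then the following are equivalent:

(I) $|\nabla_- H_\varphi|(\mu) < \infty$,
(II) $\rho \in H^1(M)$ and

\[ \Phi = \mathcal{L}^*\!\left( -\frac{D\rho}{\varphi(\rho)} - D\Psi \right) \quad \mu\text{-a.e.} \]

for some $\Phi \in T_\mu\mathcal{P}$.
Moreover, then we have $\Phi = \nabla_W[-H_\varphi](\mu)$ and $F_\mu(\Phi) = |\nabla_- H_\varphi|(\mu)$.
Note that

\[ \Phi = \mathcal{L}^*\!\left( -\frac{D\rho}{\varphi(\rho)} - D\Psi \right) = \mathcal{L}^*\big( D[-\ln_\varphi(\rho) - \Psi] \big) = \nabla[-\ln_\varphi(\rho) - \Psi]. \]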

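Theorem 10.8 (Gradient flow of $H_\varphi$) Let us suppose that $(M,\omega,\varphi,\Psi)$ is compact,
admissible and satisfies $\mathrm{Ric}_{N_\varphi} \ge 0$ as well as $\operatorname{Hess} \Psi \ge K$ on $M^*$ for some $K \in \mathbb{R}$, and
let $(\mu_t)_{t \in [0,\infty)} \subset \mathcal{P}_{ac}(M,\omega)$ be a continuous curve such that $\mu_t[M^*] = 1$ and $H_\varphi(\mu_t) < \infty$
for all $t > 0$. Then $(\mu_t)_{t \in (0,\infty)}$ is an absolutely continuous curve satisfying

\[ \dot{\mu}_t = \nabla_W[-H_\varphi](\mu_t) \in T_{\mu_t}\mathcal{P} \]

at a.e. $t \in (0,\infty)$ if and only if $(\rho_t)_{t \in [0,\infty)}$ is a weak solution to the reverse $\varphi$-heat equation
of the form

\[ \frac{\partial\rho}{\partial t} = -\operatorname{div}_\omega\big( \rho \nabla[-\ln_\varphi(\rho) - \Psi] \big) \tag{10.3} \]

with $\int_{t_0}^{t_1} \int_M F(\nabla[-\ln_\varphi(\rho_t)])^2 \, d\mu_t \, dt < \infty$ for all $0 < t_0 < t_1 < \infty$, where $\mu_t = \rho_t\omega$.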

Proof. If $\dot{\mu}_t = \nabla_W[-H_\varphi](\mu_t)$ a.e. $t$, then Proposition 10.7 yields $\dot{\mu}_t = \nabla[-\ln_\varphi(\rho_t) - \Psi] \in
T_{\mu_t}\mathcal{P}$ a.e. $t$. Thus it follows from the continuity equation (9.2) that

\[ \int_0^\infty \int_M \frac{\partial w_t}{\partial t} \, d\mu_t \, dt = -\int_0^\infty \int_M Dw_t\big( \nabla[-\ln_\varphi(\rho_t) - \Psi] \big) \, d\mu_t \, dt \]

for all $w \in C_c^\infty((0,\infty) \times M)$, and hence $\rho_t$ weakly solves (10.3). Conversely, if $\rho_t$ is a
weak solution to (10.3), then the same calculation implies that $\Phi_t = \nabla[-\ln_\varphi(\rho_t) - \Psi]$
satisfies the continuity equation (9.2), and $(\mu_t)_{t \in (0,\infty)}$ is absolutely continuous. Therefore
Proposition 10.7 shows $\dot{\mu}_t = \Phi_t = \nabla_W[-H_\varphi](\mu_t)$ a.e. $t$. □

We meant by the reverse $\varphi$-heat equation the equation with respect to the reverse
Finsler structure $\overleftarrow{F}(v) = F(-v)$. Since the gradient vector for $\overleftarrow{F}$ is written as $\overleftarrow{\nabla}\rho =
-\nabla(-\rho)$, (10.3) is indeed rewritten as

\[ \frac{\partial\rho}{\partial t} = \operatorname{div}_\omega\big( \rho \overleftarrow{\nabla}[\ln_\varphi(\rho) + \Psi] \big). \]

A Appendix: Measure concentration via $u_\varphi$-entropy inequality

Let us go back to the Riemannian situation. In Section 6, we introduced the $\varphi$-logarithmic
Sobolev inequality (6.5) by generalizing the relative entropy to the $\varphi$-relative entropy
associated with the Bregman divergence. Precisely, the classical logarithmic Sobolev
inequality (corresponding to $\varphi_1(s) = s$) of the form



EntM - Ent,(z/) < V 
is generalized to 




Vp Va 



P 



a 



2K 



I 1 2 

V[ln^(p) - In^(o-)] d/i, 



M 



where $\mu = \rho\omega$, $\nu = \sigma\omega$ and $K$ is a positive constant.

The logarithmic Sobolev inequality has the alternative form

\[ \int_M w \ln(w) \, d\nu - \left( \int_M w \, d\nu \right) \ln\left( \int_M w \, d\nu \right) \le \frac{1}{2K} \int_M \frac{|\nabla w|^2}{w} \, d\nu \]

for nonnegative measurable functions $w : M \to [0,\infty)$. Then the inequality

\[ \int_M u_\varphi(w) \, d\nu - u_\varphi\left( \int_M w \, d\nu \right) \le \frac{1}{2K} \int_M u''_\varphi(w) |\nabla w|^2 \, d\nu = \frac{1}{2K} \int_M \frac{|\nabla w|^2}{\varphi(w)} \, d\nu \]

obtained by replacing the function $r \mapsto r \ln r$ (generating the relative entropy) with
$u_\varphi$ is called the $u_\varphi$-entropy inequality, which provides a generalization of the logarithmic
Sobolev inequality different from our $\varphi$-logarithmic Sobolev inequality. The function $\varphi$ is
usually imposed to be concave, that is equivalent to the convexity of the function

\[ (s,t) \mapsto d_\varphi(s + t, t) := u_\varphi(s + t) - u_\varphi(t) - \ln_\varphi(t)s. \]

Note that $d_\varphi$ coincides with the density function of the Bregman divergence $D_\varphi$. We
refer to [Chf1] and [Chf2] for details, where instead of $u_\varphi$ it is treated $C^2$-strictly convex
functions $\Phi$ such that $1/\Phi''$ is concave.
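For the power weights $\varphi_m(s) = s^{2-m}$ the quantities above have closed forms, which makes the Bregman density easy to evaluate. The sketch below is a hedged illustration: it assumes $\ln_\varphi(t) = \int_1^t \varphi(s)^{-1} ds$ and $u_\varphi(r) = \int_0^r \ln_\varphi(s)\,ds$ (consistent with $u_{\varphi_1}(s) = s \log s - s$ quoted earlier), restricts to $m > 0$ with $m \neq 1$, and uses function names invented for this example.

```python
import math

def ln_phi(t, m):
    """ln_phi for phi_m(s) = s**(2-m): integral of s**(m-2) from 1 to t (m != 1)."""
    return math.expm1((m - 1.0) * math.log(t)) / (m - 1.0)

def u_phi(r, m):
    """u_phi(r) = integral of ln_phi from 0 to r, valid for m > 0 and m != 1."""
    return (r ** m / m - r) / (m - 1.0)

def bregman_density(s, t, m):
    """d_phi(s + t, t) = u_phi(s + t) - u_phi(t) - ln_phi(t) * s."""
    return u_phi(s + t, m) - u_phi(t, m) - ln_phi(t, m) * s

quadratic = bregman_density(1.0, 2.0, 2.0)            # m = 2 gives s**2 / 2
near_classical = bregman_density(1.0, 2.0, 1.0 + 1e-6)  # close to the m -> 1 limit
```

For $m = 2$ the density reduces to $s^2/2$, and as $m \to 1$ it approaches $(s+t)\log((s+t)/t) - s$, the density of the classical relative entropy; nonnegativity reflects the convexity of $u_\varphi$.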

We demonstrated in Section 7 that the $\varphi$-Talagrand inequality leads to the $m(\varphi)$-normal
concentration of measures. In the classical case of $\varphi_1(s) = s$, it is known that the normal
concentration also follows from the logarithmic Sobolev inequality by the Herbst argument
(see, e.g., [Le, Chapter 5]). In the same spirit, we can deduce from the $u_\varphi$-entropy
inequality the corresponding $\varphi$-normal concentration of measures. We first recall a version
of Chebyshev's inequality for later use.

Lemma A.1 (Chebyshev's inequality) Let $w$ be a measurable function on a measure
space $(X,\mu)$. Then for any nonnegative, non-decreasing, measurable function $v$ on $\mathbb{R}$,

\[ \mu[\{x \in X \mid w(x) \ge t\}] \le \frac{1}{v(t)} \int_X v(w)\,d\mu \]

holds for any $t > 0$ with $v(t) > 0$.
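
The inequality follows in one line from the monotonicity of $v$. A short sketch, written with the set $E_t := \{x \in X \mid w(x) \ge t\}$ (a name introduced here only for illustration), might read as follows.

```latex
\begin{align*}
v(t)\,\mu[E_t]
  &= \int_{E_t} v(t)\,d\mu
     && \text{($v(t)$ does not depend on $x$)} \\
  &\le \int_{E_t} v(w)\,d\mu
     && \text{($w \ge t$ on $E_t$ and $v$ is non-decreasing)} \\
  &\le \int_X v(w)\,d\mu
     && \text{($v \ge 0$).}
\end{align*}
% Dividing by v(t) > 0 gives the claimed bound.
```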

We next show an auxiliary lemma. We will normalize $\varphi$ as $\varphi(1) = 1$ for simplicity;
recall that such a normalization does not change the value of $\theta_\varphi$ (Remark 2.10).

Lemma A.2 Let $\varphi : (0,\infty) \to (0,\infty)$ be a positive concave function with $\varphi(1) = 1$.
Then we have $\theta_\varphi \le 1$ and

\[ u_\varphi(s) + a_\varphi s \ge a_\varphi\varphi(s)\ln_\varphi(s) \]

for any $s > 0$, where we set $a_\varphi := -u_\varphi(1) > 0$.

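Proof. It follows from the concavity of $\varphi$ that

\[ \frac{\varphi(s+t) - \varphi(s)}{t} \le \frac{\varphi(s) - \varphi(\varepsilon)}{s - \varepsilon} \le \frac{\varphi(s)}{s - \varepsilon} \]

for any $0 < \varepsilon < s < s+t$. Letting $\varepsilon \downarrow 0$ and then $t \downarrow 0$, we find

\[ \frac{s}{\varphi(s)} \cdot \limsup_{t \downarrow 0} \frac{\varphi(s+t) - \varphi(s)}{t} \le 1. \]

Since $s > 0$ is arbitrary, we obtain $\theta_\varphi \le 1$.

Set $A(s) := u_\varphi(s) + a_\varphi s - a_\varphi\varphi(s)\ln_\varphi(s)$ and observe $A(1) = 0$ by the choice of $a_\varphi$.
Proposition 2.13 implies

\[ 0 \ge \lim_{s \downarrow 0} \varphi(s)\ln_\varphi(s) \ge \lim_{s \downarrow 0} s^{\delta_\varphi}\ell_{2-\delta_\varphi}(s) = 0, \]

so that $\lim_{s \downarrow 0} \varphi(s)\ln_\varphi(s) = 0$ and we can put $A(0) := 0$. Since the concavity of $\varphi$ ensures
that the right derivative

\[ \varphi'_+(s) := \lim_{\varepsilon \downarrow 0} \frac{\varphi(s+\varepsilon) - \varphi(s)}{\varepsilon} \in \left[0, \frac{\varphi(s)}{s}\right] \]

is well-defined and non-increasing on $(0,\infty)$, a direct computation yields

\[ A'_+(s) := \lim_{\varepsilon \downarrow 0} \frac{A(s+\varepsilon) - A(s)}{\varepsilon} = \ln_\varphi(s)\{1 - a_\varphi\varphi'_+(s)\}. \]

Note that (2.10) shows

\[ 1 - a_\varphi\varphi'_+(1) \ge 1 - a_\varphi\theta_\varphi = 1 + \theta_\varphi\int_0^1 \ln_\varphi(t)\,dt \ge 1 + \theta_\varphi\int_0^1 \ell_{2-\theta_\varphi}(t)\,dt = 1 - \frac{\theta_\varphi}{2-\theta_\varphi} \ge 0. \]

For $s > 1$, we deduce from $\ln_\varphi(s) > \ln_\varphi(1) = 0$ and $\varphi'_+(s) \le \varphi'_+(1)$ that $A'_+(s) \ge 0$.
Hence we have $A(s) \ge A(1) = 0$. On $(0,1)$, since $\ln_\varphi < 0$, $A(0) = A(1) = 0$ and $\varphi'_+$
is non-increasing, $A$ is identically zero or there is some $s_0 \in (0,1)$ such that $A'_+ \ge 0$ on
$(0,s_0)$ and that $A'_+ \le 0$ on $(s_0,1)$. Therefore we conclude that $A \ge 0$ on $(0,1)$. □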
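
Two explicit cases may help to see what Lemma A.2 asserts. The computation below is only an illustration, assuming the conventions $\ln_\varphi(t) = \int_1^t ds/\varphi(s)$ and $u_\varphi(r) = \int_0^r \ln_\varphi(t)\,dt$, and writing $A(s)$ for the difference of the two sides of the inequality as in the proof.

```latex
% Case 1: \varphi(s) = s (the classical case); equality holds everywhere.
\begin{align*}
\ln_\varphi(s) &= \ln s, \quad u_\varphi(s) = s\ln s - s, \quad
a_\varphi = -u_\varphi(1) = 1, \\
A(s) &= (s\ln s - s) + s - s\ln s = 0 .
\end{align*}
% Case 2: \varphi(s) = \sqrt{s}; equality holds only at s = 1
% (and in the limit s -> 0).
\begin{align*}
\ln_\varphi(s) &= 2(\sqrt{s} - 1), \quad
u_\varphi(s) = \tfrac{4}{3}s^{3/2} - 2s, \quad
a_\varphi = -u_\varphi(1) = \tfrac{2}{3}, \\
A(s) &= \tfrac{4}{3}s^{3/2} - 2s + \tfrac{2}{3}s
        - \tfrac{4}{3}\sqrt{s}\,(\sqrt{s} - 1)
      = \tfrac{4}{3}\sqrt{s}\,(\sqrt{s} - 1)^2 \ge 0 .
\end{align*}
```

The first case is an instance of the alternative "$A$ is identically zero" in the proof, while the second one has $A > 0$ on $(0,1) \cup (1,\infty)$.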



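Remark A.3 The condition $\theta_\varphi \le 1$ does not imply the concavity of $\varphi$. For instance, let

\[ \varphi(s) := \begin{cases} \sqrt{s} & \text{for } 0 < s < 1, \\ s & \text{for } s \ge 1. \end{cases} \]

Then we have $\theta_\varphi = 1$, whereas $\varphi$ is clearly not concave.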
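
The value $\theta_\varphi = 1$ in this example can be checked directly. The sketch assumes that $\theta_\varphi$ is the supremum over $s > 0$ of $(s/\varphi(s)) \cdot \limsup_{t \downarrow 0} (\varphi(s+t) - \varphi(s))/t$, which is how it enters the proof of Lemma A.2, and writes $\varphi'_+$ for the right derivative.

```latex
\begin{align*}
0 < s < 1:&\quad
  \frac{s}{\varphi(s)}\,\varphi'_+(s)
  = \frac{s}{\sqrt{s}}\cdot\frac{1}{2\sqrt{s}} = \frac{1}{2}, \\
s \ge 1:&\quad
  \frac{s}{\varphi(s)}\,\varphi'_+(s)
  = \frac{s}{s}\cdot 1 = 1 ,
\end{align*}
% so the supremum is 1. Concavity fails at s = 1, where the slope
% jumps upward from 1/2 (left derivative) to 1 (right derivative):
\[
\varphi(1) = 1 < \frac{\varphi(1/2) + \varphi(3/2)}{2}
           = \frac{1/\sqrt{2} + 3/2}{2} \approx 1.10 .
\]
```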

Now we prove that the $u_\varphi$-entropy inequality implies the $\varphi$-normal concentration for
$\varphi$ as in Lemma A.2.

Theorem A.4 ($\varphi$-normal concentration from $u_\varphi$-entropy inequality) Take a positive
concave function $\varphi : (0,\infty) \to (0,\infty)$ such that $\varphi(1) = 1$. For a Riemannian
manifold $(M,g)$ and $\nu \in \mathcal{P}(M)$, assume that there is a positive constant $K$ such that the
$u_\varphi$-entropy inequality

\[ \int_M u_\varphi(w)\,d\nu - u_\varphi\left(\int_M w\,d\nu\right) \le \frac{1}{2K}\int_M u''_\varphi(w)|\nabla w|^2\,d\nu \tag{A.1} \]

holds for every nonnegative measurable function $w \in L^1(M,\nu)$ satisfying $u''_\varphi(w)|\nabla w|^2 \in L^1(M,\nu)$. Then for any $r > 0$ we have

\[ \alpha(r) \le \exp_\varphi\left(\frac{a_\varphi K r^2}{8}\right)^{-1}, \]

where $\alpha$ stands for the concentration function of $(M,\nu)$.



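Proof. Fix arbitrary $A \subset M$ with $\nu[A] \ge 1/2$ and $r > 0$. Putting $B := M \setminus B(A,r)$,
we also assume $\nu[B] > 0$ since we have $\alpha(r) = 0$ if $\nu[B] = 0$ for all such $A$. Set
$F_r(x) := \min\{d_g(x,A), r\}$ for $x \in M$, and observe that $F_r$ is 1-Lipschitz. Note also that
the function

\[ G_r(x) := F_r(x) - \int_M F_r\,d\nu \]

satisfies $G_r(x) \ge r/2$ for any $x \in B$ since $\int_M F_r\,d\nu \le r \cdot \nu[M \setminus A] \le r/2$. Applying
Chebyshev's inequality (Lemma A.1) to the nonnegative, non-decreasing function

\[ V_s(t) := \exp_\varphi\left(st - \frac{s^2}{2a_\varphi K}\right) \]

with $s > 0$ and $a_\varphi := -u_\varphi(1) > 0$, we have

\[ \nu[B] \le \nu\left[\left\{x \in M \,\middle|\, G_r(x) \ge \frac{r}{2}\right\}\right] \le \frac{1}{V_s(r/2)}\int_M V_s(G_r)\,d\nu. \tag{A.2} \]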



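We shall show that $I(s) := \int_M V_s(G_r)\,d\nu \ge \int_B V_s(r/2)\,d\nu > 0$ is bounded above by 1.
Set

\[ w_s(x) := V_s(G_r(x)) = \exp_\varphi\left(sG_r(x) - \frac{s^2}{2a_\varphi K}\right), \qquad X_s := \left\{x \in M \,\middle|\, sG_r(x) - \frac{s^2}{2a_\varphi K} > l_\varphi\right\}. \]

For $s \in (0,a_\varphi Kr)$ and any $x \in B$, we have

\[ sG_r(x) - \frac{s^2}{2a_\varphi K} \ge \frac{rs}{2} - \frac{s^2}{2a_\varphi K} = -\frac{1}{2a_\varphi K}\left(s - \frac{a_\varphi Kr}{2}\right)^2 + \frac{a_\varphi Kr^2}{8} > 0 > l_\varphi, \]

proving $B \subset X_s$. Let us introduce the strictly convex function $\Phi_\varphi(t) := u_\varphi(t) + a_\varphi t$ on
$[0,\infty)$, and observe that $\Phi_\varphi \le 0$ on $[0,1]$ and $\Phi_\varphi > 0$ on $(1,\infty)$. Then the inequality (A.1)
applied to $w = w_s$ can be rewritten as

\[ \int_M \Phi_\varphi(w_s)\,d\nu - \frac{1}{2K}\int_M \Phi''_\varphi(w_s)|\nabla w_s|^2\,d\nu \le \Phi_\varphi\left(\int_M w_s\,d\nu\right). \]

Note that $w_s$ is bounded since $G_r$ is bounded by definition, and hence $w_s \in L^1(M,\nu)$.
Moreover, $u''_\varphi(w_s)|\nabla w_s|^2 \in L^1(M,\nu)$ is seen by

\[ \int_M \Phi''_\varphi(w_s)|\nabla w_s|^2\,d\nu = \int_{X_s} s^2\varphi(w_s)|\nabla G_r|^2\,d\nu \le s^2\int_{X_s \setminus B} \varphi(w_s)\,d\nu \]

for $s \in (0,a_\varphi Kr)$, where we used the fact that $|\nabla G_r| \le 1$ on whole $M$ and $|\nabla G_r| = 0$ on
$B$. It follows from Lemma A.2 that

\[ \int_M \Phi_\varphi(w_s)\,d\nu - \frac{1}{2K}\int_M \Phi''_\varphi(w_s)|\nabla w_s|^2\,d\nu > \int_{X_s}\left[a_\varphi\varphi(w_s)\ln_\varphi(w_s) - \frac{s^2}{2K}\varphi(w_s)\right]d\nu \]
\[ = \int_{X_s} \varphi(w_s)\left(sa_\varphi G_r - \frac{s^2}{K}\right)d\nu = sa_\varphi\frac{d}{ds}\left[\int_{X_s} w_s\,d\nu\right], \]

where the inequality is strict because $\nu[B] > 0$ and $\varphi(w_s) > 0$ on $B \subset X_s$.
These together imply, as $\int_{X_s} w_s\,d\nu = \int_M w_s\,d\nu = I(s)$,

\[ sa_\varphi I'(s) < \Phi_\varphi(I(s)) \quad \text{for } s \in (0,a_\varphi Kr). \tag{A.3} \]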

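For $s_0 \in (0,a_\varphi Kr)$ chosen later, set

\[ P(s) := \exp\left(\frac{1}{a_\varphi}\int_{s_0}^{s} \frac{\Phi'_\varphi(I(t))}{t}\,dt\right), \qquad Q(s) := \frac{\Phi_\varphi(I(s))}{P(s)} \]

for $s \in (0,s_0]$, and observe

\[ Q'(s) = \frac{\Phi'_\varphi(I(s))}{sa_\varphi P(s)}\bigl\{sa_\varphi I'(s) - \Phi_\varphi(I(s))\bigr\}. \]

Then we deduce from (A.3) that $Q'(s) = 0$ if and only if $\Phi'_\varphi(I(s)) = 0$.

Assume that $\sup_{s \in (0,a_\varphi Kr)} I(s) > I(0) = 1$ and choose $s_0 \in (0,a_\varphi Kr)$ such that $I(s_0) > 1$ and

\[ c := \sup_{s \in (0,s_0]} \Phi'_\varphi(I(s)) \in (a_\varphi, 2a_\varphi) \]

(note that $\Phi'_\varphi(I(0)) = a_\varphi$). Then we have

\[ P(s) \ge \exp\left(\frac{c}{a_\varphi}\int_{s_0}^{s} \frac{1}{t}\,dt\right) = \left(\frac{s}{s_0}\right)^{c/a_\varphi} \quad \text{for } s \in (0,s_0]. \tag{A.4} \]

Moreover, since the convexity of $\Phi_\varphi$ and $\Phi_\varphi(0) = 0$ imply $I(s)\Phi'_\varphi(I(s)) \ge \Phi_\varphi(I(s))$, we
find $\Phi'_\varphi(I(s_0)) > 0$ and hence $Q'(s_0) < 0$. Note that there does not exist $s \in (0,s_0)$ such
that $Q' < 0$ on $(s,s_0)$ as well as $Q'(s) = 0$, since then $I(s)\Phi'_\varphi(I(s)) = 0 \ge \Phi_\varphi(I(s))$ and
$Q(s_0) < Q(s) \le 0$, which contradicts $I(s_0) > 1$. Thus $Q' < 0$ on $(0,s_0)$, and by (A.4)

\[ Q(s_0) < \limsup_{s \downarrow 0} Q(s) \le s_0^{c/a_\varphi} \limsup_{s \downarrow 0} s^{-c/a_\varphi}\Phi_\varphi(I(s)). \]

Now, since

\[ I'(0) = \int_M G_r\,d\nu = 0, \qquad \Phi_\varphi(I(s)) = \Phi_\varphi(1) + s\Phi'_\varphi(1)I'(0) + O(s^2) = O(s^2) \]

and $c < 2a_\varphi$, it holds $\lim_{s \downarrow 0} s^{-c/a_\varphi}\Phi_\varphi(I(s)) = 0$. This means $Q(s_0) < 0$ and hence
$I(s_0) < 1$, which is a contradiction. We therefore obtain $I(s) \le I(0) = 1$ for any
$s \in (0,a_\varphi Kr)$ as desired.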

Hence we deduce from (A.2) that $\nu[B] \le V_s(r/2)^{-1}$ for any $s \in (0,a_\varphi Kr)$. Choosing
$s = a_\varphi Kr/2$ and taking the supremum in $A$, we conclude that

\[ \alpha(r) \le \frac{1}{\exp_\varphi(a_\varphi Kr^2/8)}. \]

□
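
The choice $s = a_\varphi Kr/2$ is the one that maximizes the argument of $\exp_\varphi$ in $V_s(r/2)$. A brief check, assuming $V_s(t) = \exp_\varphi(st - s^2/(2a_\varphi K))$ as in the proof and, for the last line only, the classical case $\varphi(s) = s$ (where $\exp_\varphi = \exp$ and $a_\varphi = -u_\varphi(1) = 1$):

```latex
\begin{align*}
\frac{sr}{2} - \frac{s^2}{2a_\varphi K}
  &= \frac{a_\varphi K r^2}{8}
     - \frac{1}{2a_\varphi K}\left(s - \frac{a_\varphi K r}{2}\right)^2
  \le \frac{a_\varphi K r^2}{8},
  \quad \text{with equality at } s = \frac{a_\varphi K r}{2}, \\
V_{a_\varphi Kr/2}\!\left(\frac{r}{2}\right)
  &= \exp_\varphi\!\left(\frac{a_\varphi K r^2}{8}\right), \\
\varphi(s) = s:\quad
\alpha(r) &\le \exp\!\left(-\frac{K r^2}{8}\right).
\end{align*}
```

In the classical case this is the Gaussian-type bound produced by the Herbst argument.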



Remark A.5 Bolley and Gentil [BG] showed that if a probability measure on $\mathbb{R}^n$ satisfies
CD$(K,\infty)$ with $K > 0$, then it satisfies the $u_\varphi$-entropy inequality (A.1) with the same
constant $K$. We remark that the condition CD$(K,\infty)$ leads to the normal concentration, which is
stronger than the $\varphi$-normal concentration for $\theta_\varphi \le 1$ (since $\exp_\varphi(r)^{-1} \ge e_{2-\theta_\varphi}(r)^{-1} \ge e^{-r}$
by (2.10)), whereas there exists a probability measure which satisfies (A.1) and does not
satisfy CD$(K,\infty)$. See [LO, Theorem 2] for details, where they proved that the probability
measure on $\mathbb{R}^n$ of the form

\[ d\mu_a(x) := c_{a,n}\exp(-|x|^a)\,d\mathcal{L}^n(x), \]

where $c_{a,n}$ denotes the normalizing constant, with $a \in [1,2)$ satisfies the $u_{\varphi_m}$-entropy
inequality for $m \in (1,2]$, while the concentration function $\alpha(r)$ of $\mu_a$ is dominated by
$\exp(-r^2/3)$ (resp. $\exp(-r^a/3)$) for $r \le 1$ (resp. $r \ge 1$).